Comment by tuxdna on Transform a large dataframe - takes too long
@Amoxz here you go - gist.github.com/tuxdna/d6b16610e65e6c4ab05c70057518dbe8
Comment by tuxdna on Scala can't access Java inner class?
This incompatibility is bugging me now. Has anyone found an alternative approach other than reflection?
Comment by tuxdna on Execute command on all files in a directory
@HubertKario You may want to read more about -print0 for find and -0 for xargs, which use the null character as the separator instead of whitespace (including newlines).
Comment by tuxdna on Merge two tables in Scala/Spark
@skan: Try this - stackoverflow.com/questions/8820778/linux-join-2-csv-files
Comment by tuxdna on Binary Search in large .txt with python (ordered by hash)
The first part (before the colon) is clearly a 20-byte SHA-1 hash, but what is :27 in your sample?
Comment by tuxdna on Binary Search in large .txt with python (ordered by hash)
Your file must be accessible to the user running the web server.
Comment by tuxdna on How can I scrape faster
Did you first try to benchmark how long it takes to process a single JSON record? Assuming it takes 300 ms per record, you can process all of these records sequentially in about 5 days.
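The estimate above is plain arithmetic, sketched here under stated assumptions. The record count of 1,440,000 is back-computed from the 300 ms and 5 days in the comment, not taken from the question, and `sequential_days` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sequential_days(n_records, seconds_per_record):
    """Days needed to process n_records one after another."""
    return n_records * seconds_per_record / SECONDS_PER_DAY


# Assumed record count (hypothetical), chosen to match the comment's figures.
n_records = 1_440_000
print(round(sequential_days(n_records, 0.3), 2))
```

Plugging in a measured per-record time and the real record count gives a quick sanity check on whether sequential processing is acceptable before any parallelism is added.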
Comment by tuxdna on How do I calculate fuzz ratio between two columns?
First, how are you calculating fuzz ratio between two strings?
Comment by tuxdna on I don't understand why I get a "too many values to...
Updated with EDIT1 for your reference.
Comment by tuxdna on How to measure GPU memory usage of TensorFlow model
What is the memory consumption now?
Comment by tuxdna on How to resample a pandas data frame every half second...
This solution is neat!
Comment by tuxdna on How to efficient copy an Array to another in Scala?
Even though b is a val, which prevents us from reassigning it to some other value, the Array itself is mutable.
Answer by tuxdna for reduce key, list(values) to key, value using scala
In the absence of concrete types in the question I have made an assumption that the values in your array are Char and Int tuples respectively. Here is how we can transform to the desired...
Answer by tuxdna for Primitive types to AnyRef in Scala
That is because AnyRef is for objects and AnyVal is for primitives. You can use an Array[Any] in your case:
var s: PreparedStatement = session.prepare("insert into test_person (name, age, point) values...
Aggregate Pandas DataFrame based on condition that uses multiple columns?
import pandas as pd
data = {"K": ["A", "A", "B", "B", "B"],
        "LABEL": ["X123", "X123", "X21", "L31", "L31"],
        "VALUE": [1, 3, 1, 2, 5.0]}
df = pd.DataFrame.from_dict(data)
output = """
   K LABEL VALUE
0 A X12...
Prettify JSON data using Ruby on the terminal
I have earlier used Python for doing pretty output of JSON data like this:
python -mjson.tool input.json
I wanted to get similar output using Ruby. I am doing it like this:
ruby -rrubygems -e 'require...
String range in Scala
In Ruby we can do this:
$ irb
>> ("aa".."bb").map { |x| x }
=> ["aa", "ab", "ac", "ad", "ae", "af", "ag", "ah", "ai", "aj", "ak", "al", "am", "an", "ao", "ap", "aq", "ar", "as", "at", "au", "av",...
R DataFrame - One Hot Encoding of column containing multiple terms [duplicate]
I have a dataframe with a column having multiple values (comma separated):
mydf <- structure(list(Age = c(99L, 10L, 40L, 15L), Info = c("good, bad, sad", "nice, happy, joy", "NULL", "okay, nice,...
R - ggplot2 different columns by filtering, vertically stacked
I have a dataframe (df2) with Age, Info, Target and also Info converted into one-hot-encoded columns as below.
library(qdapTools)
require(reshape)
mydf <- structure(list(Age = c(99L, 10L, 40L, 15L),...
Why env does not print PS1 variable?
When we print the value of PS1, it is set:
$ echo $PS1
[\u@\h \W]\$
We can use the env command to print environment variables. Why does it not list the PS1 variable?
$ env | grep PS1
# No output here
Answer by tuxdna for Fast scala compiler unable to compile mutable TreeMap
Works just fine with Scala 2.12.1
$ scala -version
Scala code runner version 2.12.1 -- Copyright 2002-2016, LAMP/EPFL and Lightbend, Inc.
$ cat > script.scala
object Test {
var returnData:...
Answer by tuxdna for Python - How to get print statement to print max number...
def countdownWhile(n, max_repeat):
    for i in range(max_repeat):
        for x in range(n, 0, -1):
            print(x)
    print('blast off')
Run
In [6]: countdownWhile(5,2)
5
4
3
2
1
5
4
3
2
1
blast off
Answer by tuxdna for Difference in message-passing model of Akka and Vert.x
After a bit of Google searching, I have figured that a detailed comparison of Akka vs Vert.x has not yet been done (at least I couldn't find one).
Computation model:
Vert.x is based on Event Driven...
Answer by tuxdna for How to remove the decorate colors characters in bash...
You have multiple...
Answer by tuxdna for Create a Breeze DenseMatrix from a List of double arrays...
You can try this:
val matrix = DenseMatrix(data:_*)
EDIT1
For an explanation of how it works, you can consider data: _* as expansion into variable arguments. For example if val data =...
Answer by tuxdna for When should you use Array and when should you use...
The difference between Array and ArrayBuffer boils down to the amortized cost of resizing the array storage.
For exact details you can read this post:
https://stackoverflow.com/a/31213983/1119997
If you...
Answer by tuxdna for How to generate 4 digit random numbers in java from 0000...
This should work:
Random r = new Random();
String randomNumber = String.format("%04d", r.nextInt(1001));
System.out.println(randomNumber);
EDIT1
Random r = new Random();
String randomNumber =...
Answer by tuxdna for How does heapq.nsmallest work
heapq uses a heap (_heapify_max).
Here is the implementation for heapq.nsmallest - https://github.com/python/cpython/blob/master/Lib/heapq.py#L395
Also look...
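As a quick illustration of the observable behaviour (not of the internals linked above), the standard library documents heapq.nsmallest(n, iterable) as equivalent to sorted(iterable)[:n]. The sample list below is made up for the example.

```python
import heapq

values = [5, 1, 8, 3, 9, 2]  # made-up sample data

# Ask for the three smallest items without sorting the whole list ourselves.
smallest = heapq.nsmallest(3, values)

# Documented equivalence: same result as sorting and slicing.
assert smallest == sorted(values)[:3]
print(smallest)
```

The input list is left unmodified; nsmallest returns a new list in ascending order.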
Answer by tuxdna for How to define Tuple1 in Scala?
For tuples with cardinality 2 or more, you can use parentheses; however, for cardinality 1, you need to use Tuple1:
scala> val tuple1 = Tuple1(1)
tuple1: (Int,) = (1,)
scala> val tuple2 = ('a',...
Answer by tuxdna for Dataframe to a nxn matrix
This works:
In [85]: df2 = df.pivot(index="From", columns="To", values="Rates")
In [86]: full_index = df2.index.union(df2.columns)
In [87]: df2 = df2.reindex(labels=full_index,...
Answer by tuxdna for Remove all occurrences of a value from a list?
We can also remove all occurrences in place using either del or pop:
import random
def remove_values_from_list(lst, target):
    if type(lst) != list:
        return lst
    i = 0
    while i < len(lst):
        if lst[i] == target:...
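Since the excerpt above is cut off, here is a minimal self-contained sketch of the same idea using del. It is not the answer's original code: `remove_all_in_place` is a hypothetical name, and the loop body is an assumption about how such a removal is commonly written.

```python
def remove_all_in_place(lst, target):
    """Delete every element equal to target, mutating lst itself."""
    i = 0
    while i < len(lst):
        if lst[i] == target:
            # Deleting shifts later items left, so do not advance i here.
            del lst[i]
        else:
            i += 1
    return lst


data = [1, 2, 2, 3, 2]
result = remove_all_in_place(data, 2)
print(result)
```

Because each del shifts the tail of the list, this is quadratic in the worst case; it is only worth using when the same list object must be kept.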
Read top-level JSON dictionary incrementally using Python ijson
I have the following data in my JSON file:
{"first": {"name": "James", "age": 30},
 "second": {"name": "Max", "age": 30},
 "third": {"name": "Norah", "age": 30},
 "fourth": {"name": "Sam", "age": 30}}
I want...
Answer by tuxdna for Running executable files on Linux
A shell is a program like any other, and it waits for input. When you type in command1 arg1 arg2 ..., the first thing the shell does is try to identify command1 from among the...
Creating new Scala project using SBT?
I can create an sbt project like so:
$ mkdir project1
$ cd project1
$ sbt
Loading /usr/share/sbt/bin/sbt-launch-lib.bash
> set name := "project1"
[info] Defining *:name...
> set scalaVersion...
Explanation of output of Python tqdm.
I have a program in Python that uses tqdm to output a progress bar, which looks like this:
  0%| | 1/782 [00:02<31:00, 2.38s/it, loss=0.763 ]
 17%|█▋ | 134/782 [00:19<01:21, 7.98it/s, loss=0.375...
How to create ArrayList (ArrayList<Integer>) from array (int[]) in Java
I have seen the question: Create ArrayList from array
However, when I try that solution with the following code, it doesn't quite work in all cases:
import java.util.ArrayList;
import...
How to build a confirm() like function for NodeJS? [duplicate]
In the browser we have the confirm() function, which takes a prompt string and returns either true or false.
I am trying to create a similar function for a NodeJS CLI app. Below I am using...