datamash
kandi X-RAY | datamash Summary
GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files. It is designed to be portable and reliable, and to help researchers easily automate analysis pipelines without writing code or even short scripts.
Community Discussions
Trending Discussions on datamash
QUESTION
I have just discovered the tool datamash, which can transpose rows into columns and columns into rows.
How can I define a shortcut in ~/.vimrc to transpose automatically in both directions?
I tried the following (the delimiter is whitespace): noremap :% !datamash transpose -W
I don't know how to specify the currently open file on which I want to toggle between rows/columns and, inversely, columns/rows.
Can anyone see how to perform this transposition with a simple shortcut in vim?
EDIT: everything works fine with the following setting in ~/.vimrc:
...
ANSWER
Answered 2020-Sep-04 at 21:10
Try including <CR> at the end of the noremap line.
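For reference, the transformation that the mapping delegates to datamash can be sketched in Python. This is only an illustration of transposing whitespace-delimited text, not datamash's implementation. The function name transpose_whitespace and the choice of tab as the output separator are assumptions made for this sketch.

```python
def transpose_whitespace(text: str) -> str:
    """Transpose rows and columns of whitespace-delimited text.

    Hypothetical helper for illustration; output fields are joined with
    tabs, which is an assumption of this sketch.
    """
    # Split each non-empty line into fields on runs of spaces/tabs.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    # A clean transpose needs a rectangular table.
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of fields")
    # zip(*rows) turns the i-th field of every row into the i-th output row.
    return "\n".join("\t".join(column) for column in zip(*rows))


if __name__ == "__main__":
    print(transpose_whitespace("a b c\n1 2 3"))
```

Transposing twice gives back the original layout, so a single mapping is enough to toggle in both directions.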
QUESTION
I am using Datamash 1.7 on a CentOS 7.7 Linux x86_64 machine to sort and bin data which is 24 GB in size. The input data looks as follows (only the first 50 samples):
...
ANSWER
Answered 2020-Jul-18 at 21:21
Looking at the source (since unfortunately binning isn't described very well in the documentation), numeric binning is done by this code:
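The datamash source referred to above is not reproduced here. As a rough illustration of what fixed-width numeric binning means in general, the Python sketch below maps each value to the lower edge of its bucket. The function name bin_value and the floor-based rule are assumptions of this sketch; datamash's actual code may differ, particularly for negative values.

```python
import math


def bin_value(value: float, bucket_size: float) -> float:
    """Return the lower edge of the fixed-width bucket containing value.

    Illustrative only: assumes buckets are [k*size, (k+1)*size) and are
    labelled by their lower edge. Not taken from the datamash source.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    return math.floor(value / bucket_size) * bucket_size


if __name__ == "__main__":
    for v in (0, 2.9, 3, 7, 100.5):
        print(v, "->", bin_value(v, 3))
```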
QUESTION
I have a large text file with 100000 rows and columns, like this:
...
ANSWER
Answered 2020-Apr-28 at 20:05
Please use df.melt and drop the variable column.
QUESTION
NEED: I have a file containing data like the sample below. I need to:
- Merge all lines into one line when X number of fields match
- Create range of values in X number of fields when the values vary
In this case: Merge all lines into one where fields $1 through $6 and $8 match, create a range of the associated values in field $7
OR
Merge all lines into one where fields $1 through $7 match, create a range of the associated values in field $8
- The solution has to be something native to Linux (like a bash or awk script) that doesn't require installation of additional software (e.g. datamash).
@rtx13's solution below using TCL does work (thanks again). I'm just not sure if I can install TCL in my live environment, so I hope an AWK/BASH/etc. solution can also be proposed.
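The merge described above can be sketched in Python to make the grouping logic concrete. This is not one of the proposed solutions, only an illustration under several assumptions: fields are whitespace-separated, the varying field holds integers, a range is written as min-max, and merged fields are rejoined with single spaces. The function name merge_with_range is hypothetical.

```python
def merge_with_range(lines, range_field):
    """Merge lines that agree on every field except range_field (1-based).

    Assumptions of this sketch: whitespace-separated fields, integer values
    in the varying field, ranges written as "min-max".
    """
    idx = range_field - 1
    groups = {}  # key fields -> list of values; dicts keep insertion order
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = tuple(fields[:idx] + fields[idx + 1:])
        groups.setdefault(key, []).append(int(fields[idx]))

    merged = []
    for key, values in groups.items():
        low, high = min(values), max(values)
        span = str(low) if low == high else f"{low}-{high}"
        fields = list(key)
        fields.insert(idx, span)  # put the range back in its original position
        merged.append(" ".join(fields))
    return merged


if __name__ == "__main__":
    sample = ["a b 1 x", "a b 2 x", "a b 3 x", "c d 5 y"]
    print("\n".join(merge_with_range(sample, 3)))
```

Note that min-max hides gaps: values 1, 2 and 5 would be reported as 1-5. Whether that is acceptable depends on the data, which is not shown here.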
Original data:
...
ANSWER
Answered 2020-Mar-31 at 03:14
The following works on one column at a time.
mergecolumn script:
QUESTION
I need to produce a sliding window of millions of lines and to calculate the median of column 3. My data looks like this with column 1 always being the same, column 2 equaling the line number and column 3 being the information that I need the median for:
...
ANSWER
Answered 2020-Mar-24 at 14:16
The following script with GNU awk seems to generate the output you presented:
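The awk script from the answer is not reproduced here. As a separate, minimal Python sketch of the sliding-window median idea, the function below keeps the most recent values in a fixed-size buffer and emits a median once the buffer is full. The window size and the function name sliding_median are assumptions, since the question's actual window is not shown.

```python
from collections import deque
from statistics import median


def sliding_median(values, window):
    """Return the median of each full window of consecutive values.

    Illustrative sketch: re-sorts the window at every step, which is simple
    but may be too slow for very large windows.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buffer = deque(maxlen=window)  # oldest value drops out automatically
    medians = []
    for value in values:
        buffer.append(value)
        if len(buffer) == window:
            medians.append(median(buffer))
    return medians


if __name__ == "__main__":
    # Column 3 of whitespace-separated lines, as described in the question.
    lines = ["chr1 1 1", "chr1 2 5", "chr1 3 2", "chr1 4 8", "chr1 5 3"]
    column3 = [float(line.split()[2]) for line in lines]
    print(sliding_median(column3, 3))
```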
QUESTION
I'm trying to "convert" the following file from multiple rows into a single column.
classr#94 mesur#237 high#228 cash#232
classr#118 mesur#332 high#430 cash#421 Sar#380
classr#57 mesur#89 hight#65
My desired output:
classr#94
mesur#237
high#228
cash#232
classr#118
mesur#332
high#430
cash#421
Sar#380
classr#57
mesur#89
hight#65
I tried datamash -t: transpose < Filename but it converted my file in a very "weird" way.
I also tried grep -o # File_name but I got only the #.
I think that in the grep case, if I find a way to get the entire word, I will obtain the desired output.
...
ANSWER
Answered 2020-Jan-22 at 14:29
cat filetoconvert | tr " " "\n"
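The same reshaping can be sketched in Python for comparison. This is an illustration rather than part of the answer, and the function name one_token_per_line is hypothetical. It splits on any run of whitespace, so repeated spaces do not produce empty lines.

```python
def one_token_per_line(text: str) -> str:
    """Put every whitespace-separated token on its own line."""
    # str.split() with no argument splits on any run of spaces, tabs or
    # newlines and drops empty strings.
    return "\n".join(text.split())


if __name__ == "__main__":
    sample = (
        "classr#94 mesur#237 high#228 cash#232\n"
        "classr#57 mesur#89 hight#65\n"
    )
    print(one_token_per_line(sample))
```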
QUESTION
I have a directed graph with about 2000 nodes stored in a file. Each line represents an edge from the node stored in the first column to the node stored in the second column, so it is easy to visualize the data, for example in dot(1). Columns are separated by tabs, rows are separated by newlines, and nodes are named with any of the a-zA-Z0-9_ characters. The tree can have multiple roots, and it may have cycles, which should be ignored. I don't care about cycles; they are redundant, but they can happen in the input. Below I am presenting an example of the graph, with tr to substitute spaces for tabs and a here-document, to easily reproduce the input file:
ANSWER
Answered 2019-Aug-28 at 19:23
Since no one has posted an answer yet, here is an awk solution as a starting point:
QUESTION
Consider the following:
...
ANSWER
Answered 2017-Nov-25 at 00:19
You need to concatenate the packages into a single string (ordered by package) for each customer, then you can count by that concatenated string:
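The idea in this answer can be sketched in Python, independent of any particular database. The input shape is an assumption, since the question's table is not shown: rows are taken to be (customer, package) pairs. The function name count_package_combinations and the comma separator are also choices made for this sketch.

```python
from collections import Counter, defaultdict


def count_package_combinations(rows):
    """Count how many customers share each combination of packages.

    rows: iterable of (customer, package) pairs (assumed input shape).
    """
    packages_by_customer = defaultdict(list)
    for customer, package in rows:
        packages_by_customer[customer].append(package)
    # Sorting before joining makes "A,B" and "B,A" the same combination.
    return Counter(
        ",".join(sorted(packages))
        for packages in packages_by_customer.values()
    )


if __name__ == "__main__":
    rows = [("c1", "A"), ("c1", "B"), ("c2", "B"), ("c2", "A"), ("c3", "A")]
    for combination, customers in count_package_combinations(rows).items():
        print(combination, customers)
```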
For Postgres:
QUESTION
I have tried several awk and sed commands, as well as GNU datamash, to change the format of this data file and code the missing fields as "??", with no success. I have a file with a format that looks like the following:
...
ANSWER
Answered 2017-Nov-07 at 00:41
awk to the rescue!
With true multidimensional arrays it would be easier, but this works for most awks.
QUESTION
I have a system command which I am trying to execute, but it gives me the error "Syntax error: redirection unexpected".
Trying command:
...
ANSWER
Answered 2017-Aug-04 at 16:49
Backticks aka readpipe expect a command passed to sh (or cmd in Windows).
You appear to have a bash command rather than a sh command. Fixed:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported