multiSub | Can read GISAID | Genomics library
kandi X-RAY | multiSub Summary
multiSub accepts input sequences in FASTA format and metadata in TSV, CSV or GISAID (XLS or CSV) formats. It makes some effort to clean the input data, e.g. skipping missing sequences, stripping flanking Ns, or removing empty metadata, and outputs warnings when that happens. It can then create one or multiple output files in NCBI, NCBI-tag, NCBI-ftp, ENA-xml or GISAID-csv format and upload them directly to NCBI, ENA or GISAID.
The script takes care of the different ways to format virus names (for example, hCoV-19 for GISAID, SARS-CoV-2 for NCBI), translates the different ways to specify the country, checks the date format and adds sequence IDs where needed. It does not support more than the date, isolate and country fields, but other fields can easily be added; just email examples to maxh@ucsc.edu.
Only a single table and a single FASTA file are needed. The different export steps pick what they need out of the metadata table. For example, the field "Genome Coverage" is exported by the NCBI Genbank step into a "structured-comment" field "Genome Coverage", and also ends up in the ENA field "coverage" and GISAID's "covv_coverage". The metadata table field names should either follow NCBI standards or come from a GISAID file.
The steps have an order: you first upload to NCBI Biosamples to obtain Biosamples accessions, then you re-convert to add these IDs to the files, and then you upload the new files to Genbank or SRA, with the Biosamples accessions in them. The examples below should make this clear.
Many thanks to Stephan Fuchs and Kyanoush Yahosseini, Robert Koch Institut, Berlin, for sending me their Python ENA uploader code, from which I copied. Thanks also to the ENA Helpdesk and the NCBI Helpdesk for their quick replies, and to Kelsey Florek and Ethan Wang for bug reports. The NCBI bulk upload draws heavily from examples provided by Danny Park at the Broad Institute. Without all of these people, this program would not have been possible.
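The input cleaning mentioned in the summary (skipping missing sequences, stripping flanking Ns, and warning about it) can be illustrated with a short Python sketch. This is not multiSub's own code: the function name `clean_sequences`, its arguments and the warning texts are assumptions made for illustration only.

```python
def clean_sequences(seqs, meta_ids):
    """Illustrative cleaning pass (hypothetical helper, not part of multiSub).

    seqs     -- dict mapping sequence ID to nucleotide string
    meta_ids -- iterable of sequence IDs listed in the metadata table
    Returns (cleaned, warnings): a dict of cleaned sequences and a list
    of warning messages describing what was skipped or changed.
    """
    cleaned = {}
    warnings = []
    for seq_id in meta_ids:
        seq = seqs.get(seq_id)
        if not seq:
            # The metadata row has no (or an empty) sequence: skip it.
            warnings.append(f"{seq_id}: no sequence found, skipping")
            continue
        # Remove Ns at both ends only; Ns inside the sequence are kept.
        stripped = seq.strip("Nn")
        if not stripped:
            warnings.append(f"{seq_id}: sequence consists only of Ns, skipping")
            continue
        if len(stripped) != len(seq):
            removed = len(seq) - len(stripped)
            warnings.append(f"{seq_id}: stripped {removed} flanking Ns")
        cleaned[seq_id] = stripped
    return cleaned, warnings
```

Collecting the warnings in a list rather than printing them directly lets the caller decide whether to show them, log them, or stop the conversion.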
Community Discussions
Trending Discussions on multiSub
QUESTION
I initially had a working script like this that goes over the CSV files in a folder and substitutes a substring:
...ANSWER
Answered 2020-Jul-08 at 21:44
Your code actually works for me as is when I test it, but you have a lot of unnecessary processing in there that may be introducing errors. The big advantage of using fileinput over regular open is that it can loop through lines in multiple files without needing another loop to open each file individually. So try this and see if it works:
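The answer's original snippet is not reproduced on this page. As a minimal sketch of the approach it describes, assuming the goal is an in-place substring replacement across all CSV files in one folder, `fileinput` could be used roughly like this (the function name `substitute_in_csv_files` and its parameters are hypothetical):

```python
import fileinput
import glob
import os


def substitute_in_csv_files(folder, old, new):
    """Replace `old` with `new` in every *.csv file in `folder`, in place.

    Hypothetical helper for illustration; returns the list of files touched.
    """
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        # With an empty file list, fileinput would fall back to
        # sys.argv / stdin, so return early instead.
        return paths
    # inplace=True redirects print() into the file currently being read,
    # and one loop covers the lines of all files.
    with fileinput.input(files=paths, inplace=True) as lines:
        for line in lines:
            print(line.replace(old, new), end="")
    return paths
```

Because `inplace=True` overwrites the files, it is worth trying this on a copy of the folder first, or passing `backup=".bak"` to `fileinput.input` to keep the originals.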
QUESTION
I need to write a function that carries out a number of substitutions on a string, based on looking up values in the sheet.
My intention is to iterate over a list of substitution pairs in the sheet and call the worksheet function 'substitute' on each iteration.
...ANSWER
Answered 2019-Jan-08 at 00:20
I think you want Cells, not Offset, as Offset will return a range the same size as the parent range.
QUESTION
I have text like
...ANSWER
Answered 2018-Mar-07 at 21:34
For your problem(s) it's better to capture the text you DO want and replace the whole line with that. This captures the data you are interested in and allows you to rebuild it however you'd like (in the replace line):
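The question's sample text and the answer's pattern are not shown on this page, so the following Python sketch only illustrates the general technique of capturing the wanted parts and replacing the whole line. The input format (`id=` and `name=` fields surrounded by unwanted text) and the names `LINE_RE` and `rebuild_line` are invented for the example.

```python
import re

# Hypothetical line layout: "<junk> id=<digits> <junk> name=<word> <junk>".
# The pattern matches the entire line and captures only the two wanted parts.
LINE_RE = re.compile(r"^.*?id=(\d+).*?name=(\w+).*$")


def rebuild_line(line):
    """Replace the whole line with a string built from the captured groups.

    Lines that do not match the pattern are returned unchanged.
    """
    return LINE_RE.sub(r"\2:\1", line)


print(rebuild_line("junk id=42 more junk name=alice trailing"))  # alice:42
```

Anchoring the pattern with `^` and `$` is what makes the substitution replace the full line rather than only the matched fragment.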
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported