fq | F@#$*&%Q (Message queue that is fast, brokered, in C and gets out of your way) | Pub Sub library
kandi X-RAY | fq Summary
fq is a brokered message queue using a publish subscribe model. It is architected for performance and isn't (today) designed for large numbers of connected clients.
Community Discussions
Trending Discussions on fq
QUESTION
I'm currently making a start on using Nextflow to develop a bioinformatics pipeline. Below, I've created a params.files variable which contains my FASTQ files, and then input this into the fasta_files channel.
The process trimming and its script take this channel as the input, and then ideally, I would output all the $sample".trimmed.fq.gz files into the output channel, trimmed_channel. However, when I run this script, I get the following error:
Missing output file(s) `trimmed_files` expected by process `trimming` (1)
The nextflow script I'm trying to run is:
...
ANSWER
Answered 2022-Mar-09 at 06:14
Nextflow does not export the variable trimmed_files to its own scope unless you tell it to do so using the env output qualifier; however, doing it that way would not be very idiomatic.
Since you know the pattern of your output files ("FASTQ/*_trimmed.fq.gz"), simply pass that pattern as output:
path "FASTQ/*_trimmed.fq.gz" into trimmed_channel
Some things you do, but probably want to avoid:
- Changing directory inside your NF process. Don't do this; it entirely breaks the whole concept of Nextflow's /work folder setup.
- Writing a bash loop inside a NF process. If you set up your channels correctly, there should only be 1 task per spawned process.
QUESTION
I have received multiple fastq.gz files from Illumina sequencing for 100 samples. All the fastq.gz files for the respective samples are in separate folders according to the sample ID. Moreover, I have multiple (8-16) R1.fastq.gz and R2.fastq.gz files for one sample. So, I used the following code for concatenating all the R1.fastq.gz and R2.fastq.gz files into a single R1.fastq.gz and a single R2.fastq.gz.
ANSWER
Answered 2022-Feb-24 at 09:21
Based on the provided file structure, would you please try:
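The answer's original code is not reproduced on this page. As a rough illustration of the task in the question, here is a minimal Python sketch. It assumes each sample folder holds files whose names contain `R1` or `R2` as a separate token (for example `S1_L001_R1.fastq.gz`). The function name `merge_sample_fastqs` and the output naming are hypothetical. It relies on the fact that gzip files concatenated byte for byte still form a valid multi-member gzip stream.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumption: the read direction appears as a token delimited by "_" or ".".
READ_TOKEN = re.compile(r"(?:^|[_.])(R[12])[_.]")


def merge_sample_fastqs(sample_dir, out_dir):
    """Concatenate all R1 (and all R2) .fastq.gz files of one sample folder.

    Returns a dict mapping "R1"/"R2" to the merged file path.
    Use an out_dir different from sample_dir so reruns do not re-read merged files.
    """
    sample_dir, out_dir = Path(sample_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the input files by read direction, in sorted (lane) order.
    by_read = defaultdict(list)
    for path in sorted(sample_dir.glob("*.fastq.gz")):
        match = READ_TOKEN.search(path.name)
        if match:
            by_read[match.group(1)].append(path)

    merged = {}
    for read, parts in by_read.items():
        target = out_dir / f"{sample_dir.name}_{read}.fastq.gz"
        with open(target, "wb") as dst:
            for part in parts:
                # Raw byte copy: no decompression is needed.
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, dst)
        merged[read] = target
    return merged
```

Calling the function once per sample folder (for example in a loop over the subdirectories of the sequencing run) would produce one merged R1 and one merged R2 file per sample.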
QUESTION
New here!
I'm trying to loop a Python script in bash that gets stats from fastq files. I want it to loop through all the fastq files in a directory and save the outputs in a text file. Ideally, I don't want to edit the Python script.
This is the script that works when I'm not looping it:
...
ANSWER
Answered 2022-Feb-09 at 14:08
Like this if you want all results in a single file:
QUESTION
I have a tuple channel containing entries like:
...
ANSWER
Answered 2022-Feb-09 at 09:44
I tend to avoid using the each qualifier like this because of this recommendation in the docs:
If you need to repeat the execution of a process over n-tuple of elements instead a simple values or files, create a channel combining the input values as needed to trigger the process execution multiple times. In this regard, see the combine, cross and phase operators.
I don't actually think there's a way to join channels using a regex, but what you can do is use the combine operator to produce the Cartesian product of the items emitted by two channels. And if you supply the by parameter, you can combine the items that share a common matching key. For example, untested:
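The answer's Nextflow snippet is not reproduced on this page. To make the idea of combining by a shared key concrete, the following Python sketch mimics only the semantics described above: it pairs every item from one list with every item from another list that has the same value at the key position. It is not Nextflow code, and `combine_by_key` is a hypothetical helper name.

```python
from collections import defaultdict


def combine_by_key(left, right, by=0):
    """Pair each tuple in `left` with every tuple in `right` sharing item[by].

    Emits (key, *rest_of_left, *rest_of_right), i.e. a Cartesian product
    restricted to items with a matching key.
    """
    # Index the right-hand items by their key for fast lookup.
    right_by_key = defaultdict(list)
    for item in right:
        right_by_key[item[by]].append(item)

    def rest(item):
        # Everything except the key element, in original order.
        return tuple(v for i, v in enumerate(item) if i != by)

    combined = []
    for item in left:
        for match in right_by_key.get(item[by], []):
            combined.append((item[by],) + rest(item) + rest(match))
    return combined


reads = [("s1", "s1.fq.gz"), ("s2", "s2.fq.gz")]
refs = [("s1", "ref_a.fa"), ("s1", "ref_b.fa"), ("s3", "ref_c.fa")]
pairs = combine_by_key(reads, refs)
```

Items whose key appears in only one of the two inputs (here `s2` and `s3`) produce no output, which is the behavior one would want when each process execution needs a complete tuple.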
QUESTION
I am trying to create a robust glob pattern that will match most of the different naming conventions used for fastq files we receive. However, the version of nextflow I am using (20.10.0) on the HPC doesn't seem to accept what I've written.
Here are some examples of file names:
...
ANSWER
Answered 2022-Jan-31 at 14:08
The following glob pattern seems to match some of the more common FASTQ filenames:
QUESTION
I've been trying to run a while-loop in parallel for a job that takes days.
I've seen other answers where parallel was implemented within a while-loop, but in that case it works in blocks, where the next job only starts after all previous jobs have finished.
This is the code, which reproduces the two columns of the CSV file:
...
ANSWER
Answered 2022-Jan-27 at 12:29
cat <<_EOF > samples.csv
a2,b2
a3,b3
a4,b4
_EOF
cat samples.csv | parallel --colsep , echo column 1 = {1} column 2 = {2}
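For readers who would rather stay in one scripting language, the same idea (one job per CSV row, with a new job starting as soon as a worker is free rather than in fixed blocks) can be sketched in Python with the standard library. This is an illustrative alternative, not a replacement for GNU parallel. The names `describe` and `process_rows` are hypothetical, and the worker only formats a string where a real pipeline would launch the long-running command.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def describe(row):
    # Stand-in for the real per-row job; a real worker might call subprocess.run.
    return f"column 1 = {row[0]} column 2 = {row[1]}"


def process_rows(csv_text, worker=describe, max_workers=4):
    """Run `worker` once per CSV row using a pool of threads.

    The pool hands the next row to whichever thread becomes idle first,
    so slow rows do not hold up the start of later ones.
    Results are returned in input order.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, rows))


results = process_rows("a2,b2\na3,b3\na4,b4\n")
```

Threads are sufficient when each job mostly waits on an external program; for CPU-bound Python work, `ProcessPoolExecutor` from the same module has the same interface.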
QUESTION
Hello, I need to iterate over pairs of files and do something with them.
For example, I have 4 files, which are named AA2234_1.fastq.gz, AA2234_2.fastq.gz, AA3945_1.fastq.gz and AA3945_2.fastq.gz.
As you can probably tell, the pairs are AA2234_1.fastq.gz <-> AA2234_2.fastq.gz and AA3945_1.fastq.gz <-> AA3945_2.fastq.gz (they share the name before the _ sign).
I have a command with syntax looking like this:
initialize_of_command file1 file2 output_a output_b output_c output_d parameteres
I want this script to find the number of files with the fastq.gz extension in a directory, divide it by 2 to find the number of pairs, then match the pairs together, probably using a regex (maybe into two variables), and execute this command once for each pair.
I have no idea how to pair up those files using a regex, or how to iterate over the pairs so that the script knows which pairs it has already iterated over.
Here is my unfinished script:
...
ANSWER
Answered 2022-Jan-08 at 02:50
One way to iterate over pairs of arguments:
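The answer's shell code is not reproduced on this page. As a hedged illustration of the pairing step described in the question, the Python sketch below groups file names by the part before the final `_` and yields each complete pair once. It assumes the naming scheme shown in the question (`<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`). The function name `pair_fastq_files` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path

SUFFIX = ".fastq.gz"


def pair_fastq_files(names):
    """Return (sample, file_1, file_2) for every sample that has both mates."""
    groups = defaultdict(dict)
    for name in names:
        base = Path(name).name
        if not base.endswith(SUFFIX):
            continue
        # Split "AA2234_1" into sample "AA2234" and mate "1".
        sample, sep, mate = base[: -len(SUFFIX)].rpartition("_")
        if sep and mate in ("1", "2"):
            groups[sample][mate] = name
    # Keep only complete pairs; sorting makes the iteration order predictable.
    return [
        (sample, files["1"], files["2"])
        for sample, files in sorted(groups.items())
        if len(files) == 2
    ]


pairs = pair_fastq_files([
    "AA2234_1.fastq.gz",
    "AA2234_2.fastq.gz",
    "AA3945_1.fastq.gz",
    "AA3945_2.fastq.gz",
])
```

Each resulting tuple could then be passed to the command from the question, for example via `subprocess.run`, so no counter or "already processed" bookkeeping is needed.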
QUESTION
I have a list of file names in a file called data.tsv. Each row corresponds to the same sample ID and there can be up to 8 files per ID. I need to merge files that end in "_1.[variable extension]" and "_2.[variable extension]".
Data below; however, Stack Overflow converts tabs to spaces, so note that it should be tab-delimited:
...
ANSWER
Answered 2022-Jan-08 at 14:56
Fixed all your script and added comments everywhere, so you can understand how it works in detail:
QUESTION
This is a follow-up of a previous question about using a Python dictionary to generate a list of files to include as input for a single step. In this case, I'm interested in merging BAM files for a single sample that have been generated by mapping FASTQ files from multiple runs.
I am running into an error in my rule combine_bams only for a single sample:
ANSWER
Answered 2022-Jan-04 at 03:10
In rule combine_bams, when using a lambda expression, you will need to provide the values of all {} wildcards. Right now only the run information is provided. One way to fix this is to pass the kwarg allow_missing=True to expand:
QUESTION
I'm writing a Snakemake pipeline for scRNAseq sequence processing which uses STAR as the alignment tool.
Loading genome index into the memory for each alignment job is very time-consuming. Since I have a lot of memory at my disposal, I figure that it is feasible to use the "shared memory" module from STAR. With it I can load the genome into the memory and keep it there until all the jobs are done.
With shell this is very easy to achieve, for example here.
But chaining it in a Snakemake pipeline seems a non-trivial task, especially since the STAR "shared memory" module doesn't require any input or output anything. For example, I tried:
...
ANSWER
Answered 2021-Nov-11 at 08:55
the STAR "shared memory" module doesn't require any input or output anything
I'm not familiar with STAR and the shared memory option, but the issue about input/output can be easily resolved using dummy flag files that signal a step is done. E.g.:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported