hgvs | HGVS variant name parsing and generation | Genomics library

by counsyl Python Version: v0.9.2 License: MIT

kandi X-RAY | hgvs Summary

hgvs is a Python library typically used in Artificial Intelligence and Genomics applications. It has no bugs and no vulnerabilities, a build file is available, it has a permissive license, and it has low support. You can download it from GitHub.

In most next-generation sequencing applications, variants are first discovered and described in terms of their genomic coordinates, such as chromosome 7, position 117,199,563 with reference allele G and alternative allele T. According to the HGVS standard, we can describe this variant as NC_000007.13:g.117199563G>T. The first part of the name is a RefSeq ID, NC_000007.13, for chromosome 7 version 13. The g. denotes that this is a variant described in genomic (i.e. chromosomal) coordinates. Lastly, the chromosomal position, reference allele, and alternative allele are indicated. For simple single nucleotide changes the > character is used.

More commonly, a variant will be described using a cDNA or protein style HGVS name. In the example above, the variant in cDNA style is named NM_000492.3:c.1438G>T. Here again, the first part of the name refers to a RefSeq sequence, this time mRNA transcript NM_000492 version 3. Optionally, the gene name can also be given as NM_000492.3(CFTR). The c. indicates that this is a cDNA name, and the coordinate indicates that this mutation occurs at position 1438 along the coding portion of the spliced transcript (i.e. position 1 is the first base of the ATG translation start codon). Briefly, the protein style of the variant name is NP_000483.3:p.Gly480Cys, which indicates the change in amino-acid coordinates (480) along an amino-acid sequence (NP_000483.3) and gives the reference and alternative amino-acid alleles (Gly and Cys, respectively).

The standard also specifies custom name formats for many mutation categories such as insertions (NM_000492.3:c.1438_1439insA), deletions (NM_000492.3:c.1438_1440delGGT), duplications (NM_000492.3:c.1438_1440dupGGT), and several other more complex genomic rearrangements. While many of these names appear to be simple to parse or generate, there are many corner cases, especially with cDNA HGVS names.
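
To make the naming scheme concrete, here is a minimal sketch of how simple nucleotide-level names could be split into their parts with a regular expression. It is an illustration only, not the library's parser: the pattern HGVS_RE, the SimpleHGVS tuple, and the parse_simple_hgvs function are hypothetical names, and protein (p.) names and complex rearrangements are deliberately left out.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern covering only simple g. and c. names; the real HGVS
# grammar is much larger, and this is not the library's own parser.
HGVS_RE = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+)"       # RefSeq ID with version, e.g. NM_000492.3
    r"(?:\((?P<gene>[^)]+)\))?"                # optional gene name, e.g. (CFTR)
    r":(?P<kind>[gc])\."                       # g. = genomic, c. = cDNA
    r"(?P<start>[-*]?\d+(?:[-+]\d+)?)"         # position, optional UTR prefix or intron offset
    r"(?:_(?P<end>[-*]?\d+(?:[-+]\d+)?))?"     # optional end of a range
    r"(?:(?P<ref>[ACGT]+)>(?P<alt>[ACGT]+)"    # substitution, e.g. G>T
    r"|(?P<op>ins|del|dup)(?P<seq>[ACGT]*))$"  # insertion, deletion, duplication
)


class SimpleHGVS(NamedTuple):
    accession: str
    gene: Optional[str]
    kind: str
    start: str            # kept as text so forms like "-4" or "4243-20" survive
    end: Optional[str]
    mutation_type: str    # one of ">", "ins", "del", "dup"
    ref: str
    alt: str


def parse_simple_hgvs(name: str) -> SimpleHGVS:
    """Split a simple HGVS name into its parts, or raise ValueError."""
    m = HGVS_RE.match(name)
    if m is None:
        raise ValueError(f"unsupported HGVS name: {name!r}")
    if m["op"] is None:
        mutation_type, ref, alt = ">", m["ref"], m["alt"]
    elif m["op"] == "ins":
        mutation_type, ref, alt = "ins", "", m["seq"]
    elif m["op"] == "del":
        mutation_type, ref, alt = "del", m["seq"], ""
    else:
        # A duplication repeats the named sequence once more.
        mutation_type, ref, alt = "dup", m["seq"], m["seq"] * 2
    return SimpleHGVS(m["accession"], m["gene"], m["kind"],
                      m["start"], m["end"], mutation_type, ref, alt)
```

For instance, parse_simple_hgvs("NM_000492.3(CFTR):c.1438G>T") yields accession NM_000492.3, gene CFTR, kind c, start 1438, and alleles G and T.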
Corner cases in cDNA names include variants before the start codon, which have negative cDNA coordinates (NM_000492.3:c.-4G>C), and variants after the stop codon, which have their own format (NM_000492.3:c.*33C>T). Variants within introns are indicated by the closest exonic base with an additional genomic offset, such as NM_000492.3:c.4243-20A>G (the variant is 20 bases in the 5' direction of the cDNA coordinate 4243). Lastly, all coordinates and alleles are specified on the strand of the transcript. This library properly handles all logic necessary to convert genomic coordinates to and from HGVS cDNA coordinates.

Another important consideration for any library that handles HGVS names is variant normalization. The HGVS standard aims to provide a "uniform and unequivocal" description of variants: two people discovering a variant should be able to arrive at the same name for it. Such a property is very useful for checking whether a variant has been seen before and for connecting all known relevant information. For SNPs, this property is fairly easy to achieve. However, for insertions and deletions (indels) near repetitive regions, many indels are equivalent (e.g. it doesn’t matter which AT in a run of ATATATAT was deleted).

The VCF file format has chosen to uniquely specify such indels by using the most left-aligned genomic coordinate. Therefore, compliant variant callers that output VCF will have applied this normalization. The HGVS standard also specifies a normalization for such indels, but it states that indels should use the most 3' position in a transcript. For genes on the positive strand, this is the opposite of the direction specified by VCF. This library properly implements both kinds of variant normalization and allows easy conversion between HGVS and VCF style variants. It also handles many other cases of normalization (e.g. the HGVS standard recommends indicating an insertion with the dup notation instead of ins if it can be represented as a tandem duplication).
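
The two normalization conventions can be illustrated with a small sketch for deletions. This is a simplified model under stated assumptions, not the library's implementation: the reference is a plain Python string, positions are 0-based, the gene is on the positive strand (so the 3' direction is to the right), and the helper names apply_deletion and shift_deletion are hypothetical. Real VCF records also carry an anchor base, which is omitted here.

```python
def apply_deletion(ref: str, start: int, length: int) -> str:
    """Return the sequence that results from deleting ref[start:start+length]."""
    return ref[:start] + ref[start + length:]


def shift_deletion(ref: str, start: int, length: int, to_right: bool) -> int:
    """Slide a deletion as far as possible without changing the result.

    to_right=False gives the left-aligned start (the VCF convention).
    to_right=True gives the rightmost start, which is the 3'-most position
    for a transcript on the positive strand (the HGVS convention).
    """
    end = start + length
    if to_right:
        # Moving right by one base is safe when the base leaving the deleted
        # window on the left equals the base entering it on the right.
        while end < len(ref) and ref[start] == ref[end]:
            start += 1
            end += 1
    else:
        # Mirror image: compare the base entering on the left with the base
        # leaving on the right.
        while start > 0 and ref[start - 1] == ref[end - 1]:
            start -= 1
            end -= 1
    return start
```

For the reference GGATATATATCC, deleting the AT at index 4 gives the same sequence as deleting the AT at index 2 (left-aligned, VCF style) or at index 8 (shifted 3', HGVS style on the positive strand).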

Support

hgvs has a low active ecosystem.

It has 148 star(s) with 70 fork(s). There are 37 watchers for this library.

It had no major release in the last 12 months.

There are 14 open issues and 20 have been closed. On average issues are closed in 204 days. There are 4 open pull requests and 0 closed requests.

It has a neutral sentiment in the developer community.

The latest version of hgvs is v0.9.2

Quality

hgvs has 0 bugs and 24 code smells.

Security

hgvs has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

hgvs code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

hgvs is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

hgvs releases are available to install and integrate.

Build file is available. You can build the component from source.

Installation instructions, examples and code snippets are available.

hgvs saves you 892 person hours of effort in developing the same functionality from scratch.

It has 2039 lines of code, 113 functions and 11 files.

It has high code complexity. Code complexity directly impacts maintainability of the code.

Top functions reviewed by kandi - BETA

kandi has reviewed hgvs and discovered the below as its top functions. This is intended to give you an instant insight into hgvs implemented functionality, and help decide if they suit your requirements.

Align the sequence to the right strand
Replace indel
Performs a justification of an indel
Get a sequence from a chromosome
Pad sequences with 1 - prime bases
Parse name
Parse cdna
Parse an allele
Validate the coordinates
Read transcripts from a refgene file
Parse a refgene file
Create a transcript from a transcript
Flip the strand
Create a sequence from a position
Get the allele for a given transcript
Convert cDNA to genomic coordinate
Return the coordinates of this HGVS name
Find the codon in the given list of exons
Returns True iff the reference sequence matches the reference sequence
Get all exons in a transcript
Return a BED6Interval object
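
Several entries in the list above deal with converting between cDNA and genomic coordinates. The following sketch shows the basic arithmetic for a positive-strand transcript. It rests on assumptions that are not taken from the library: exons are given as sorted 0-based half-open genomic intervals, cds_start is the 0-based genomic position of the first base of the start codon, and the function name cdna_to_genomic_sketch is hypothetical. Coordinates after the stop codon (c.*N) and negative-strand transcripts are not handled, and the stop codon position is not checked.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # 0-based, half-open genomic interval [start, end)


def cdna_to_genomic_sketch(exons: List[Exon], cds_start: int,
                           coord: int, offset: int = 0) -> int:
    """Map c.<coord><offset> to a 1-based genomic position (positive strand only)."""
    if coord == 0:
        raise ValueError("cDNA coordinates skip 0: c.-1 is followed by c.1")

    # Count the exonic bases that lie upstream of the start codon.
    upstream = 0
    for start, end in exons:
        if start <= cds_start < end:
            upstream += cds_start - start
            break
        upstream += end - start
    else:
        raise ValueError("cds_start does not fall inside any exon")

    # 0-based index of the anchoring exonic base along the spliced transcript.
    tx_index = upstream + (coord - 1 if coord > 0 else coord)
    if tx_index < 0:
        raise ValueError("coordinate lies upstream of the transcript")

    # Walk the exons to turn the transcript index back into a genomic position,
    # then apply the intronic offset and convert to 1-based.
    for start, end in exons:
        length = end - start
        if tx_index < length:
            return start + tx_index + offset + 1
        tx_index -= length
    raise ValueError("coordinate lies downstream of the transcript")
```

With two exons at (100, 200) and (300, 400) and cds_start at 150, c.1 maps to genomic position 151, c.51 maps to 301 (the first base of the second exon), and c.51-20 maps to 281 inside the intron.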

Community Discussions

Trending Discussions on hgvs

getting different patterns within the same string

How can I separate the information in one column of the table I have into 3 separate columns?

How can I separate 3 different information in a column?

How to append list to list using python?

QUESTION

getting different patterns within the same string

Asked 2022-Mar-03 at 10:21

I've got the following data frame:

...

ANSWER

Answered 2022-Mar-03 at 10:21

Here is a potential solution:

Source https://stackoverflow.com/questions/71334906

QUESTION

How can I separate the information in one column of the table I have into 3 separate columns?

Asked 2022-Feb-13 at 21:16

For example, one column of the table I have is like this

...

ANSWER

Answered 2022-Feb-13 at 21:05

Code

Here is a base R way with substr.

Source https://stackoverflow.com/questions/71104772

QUESTION

How can I separate 3 different information in a column?

Asked 2022-Feb-12 at 16:58

For example, in the column I have, there is a line written 'Ser25Phe'. And I want to split the column HGVS.Consequence e.g. as 'Ser 25 Phe'...

...

ANSWER

Answered 2022-Feb-12 at 11:29

Using gsub, assuming that e.g. "AsAsp" should also be split into "As Asp".
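
The R code from the original answer is not reproduced here. As a rough Python illustration of the same task (not the answerer's method), a regular expression can split a protein-level change such as Ser25Phe into its three parts. The function name split_protein_change is hypothetical, and the "AsAsp" case mentioned in the answer is not covered.

```python
import re


def split_protein_change(change: str) -> str:
    """Split e.g. 'Ser25Phe' into 'Ser 25 Phe' (letters, digits, letters)."""
    m = re.fullmatch(r"([A-Za-z]+)(\d+)([A-Za-z]+)", change)
    if m is None:
        raise ValueError(f"unexpected format: {change!r}")
    return " ".join(m.groups())
```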

Source https://stackoverflow.com/questions/71089487

QUESTION

How to append list to list using python?

Asked 2021-Nov-24 at 07:20

I want to append listB to listA

input

...

ANSWER

Answered 2021-Nov-24 at 06:07

listA.extend(listB) should work, but it modifies the original listA. If you want sum_list to be a different list, you can copy it first with something like

Source https://stackoverflow.com/questions/70091384

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install hgvs

This library can be installed using the setup.py file, typically by running python setup.py install from the source directory.

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, check existing answers or ask on the Stack Overflow community page.
