pdbio | Pandas-based Data Handler for VCF, BED, and SAM Files | Genomics library

by dceoy | Python | Version: 0.4.2 | License: MIT


kandi X-RAY | pdbio Summary

pdbio is a Python library typically used in Artificial Intelligence and Genomics applications. It has no reported bugs or vulnerabilities, a build file is available, it is released under a permissive license, and it has low support. You can install it with 'pip install pdbio' or download it from GitHub or PyPI.

Pandas-based Data Handler for VCF, BED, and SAM Files.

Support

pdbio has a low active ecosystem.

It has 13 stars and 0 forks. There are 2 watchers for this library.

It had no major release in the last 12 months.

pdbio has no issues reported. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of pdbio is 0.4.2.

Quality

pdbio has no bugs reported.

Security

pdbio has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

pdbio is licensed under the MIT License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

pdbio releases are available to install and integrate.

Deployable package is available in PyPI.

Build file is available. You can build the component from source.

Installation instructions are not available. Examples and code snippets are available.

Top functions reviewed by kandi - BETA

kandi has reviewed pdbio and identified the functions below as its top functions. The list is intended to give you a quick view of the functionality pdbio implements and to help you decide whether it suits your requirements.

Identify variants in a FASTA file
Read a single chromosome from a FASTA file
Convert a variant to a dict
Open a file
Convert file to csv
Sort the dataframe
Sort a DataFrame by chromosome and position
Write header to file
Load a SAM file
Convert a list of lines into a pandas DataFrame
View files
Returns the executable for a given command
Convert a cigar string to match chrs
Convert a string to a chrs
Convert md5 to chrs string
Return a new Pandas DataFrame with consolidated tags
Creates a pandas dataframe from a DataFrame
Calculate the median depth
Load the table
Load data from file
Configure logging
Write the samline
Loads the table
Sort by chromosome
Load a csv file
Get the region depth


pdbio Key Features

No Key Features are available at this moment for pdbio.

pdbio Examples and Code Snippets

No Code Snippets are available at this moment for pdbio.

Community Discussions

Trending Discussions on pdbio

Extracting chains from a electron microscopy structure

Converting segments of large .cif files to smaller .pdb files

How to save each ligand from a PDB file separately with Bio.PDB?

Extracting only the chains that we need from a PDB file

TypeError when creating PDB file using Biopython's PDBIO, only with certain files

How to separately get the X, Y or Z coordinates from a pdb file

How to move protein coordinates with respect to a reference frame

Replace residue numbers in pdb with given residue numbers list

QUESTION

Extracting chains from a electron microscopy structure

Asked 2020-Sep-24 at 17:12

I need to extract single chains from a structure file in cif format as available from the PDB. I've read several related questions, such as this and this. The proposed solution indeed works well if the chain ID is an integer or a single character. If applied to a structure such as 6KMW to extract chain aA it raises the error TypeError: %c requires int or char. Full code used to reproduce the error and output included below.

...

ANSWER

Answered 2020-Sep-24 at 17:12

I think what you are trying to achieve is simply not possible. Effectively, you want to convert a cif file to a pdb file; it does not matter that you also want to reduce the structure to a single chain in the process. The PDB format is a file format from the last century (I know how widespread it still is today). It is column-oriented and allows only one character for the chain ID. This is the reason you cannot download a PDB file for 6KMW. See the tooltip at https://www.rcsb.org/structure/6KMW: "PDB format files are not available for large structures". In your case, "large" means a structure with so many chains that they need two-character IDs.

You cannot store a two-character chain name in a PDB file. That leaves two options:

Rename the chain "aA" and save the file in PDB format
Don't use the PDB format as your file format but stick to cif

This snippet renames the chain and stores the structure as a pdb file:
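The answer's original snippet is not reproduced here. As a rough sketch of the renaming idea only, the following assigns an unused one-character ID to every chain whose ID is too long for the PDB chain column. The helper name `single_char_chain_ids` is hypothetical, and applying the mapping to a Biopython structure by assigning `chain.id` is an assumption noted in the comments.

```python
import string


def single_char_chain_ids(chain_ids):
    """Map each chain ID to a one-character ID that fits the PDB chain column.

    IDs that are already one character long are kept as they are; longer IDs
    receive the next unused letter or digit. Input IDs are assumed unique.
    """
    pool = string.ascii_uppercase + string.ascii_lowercase + string.digits
    taken = {cid for cid in chain_ids if len(cid) == 1}
    free = [c for c in pool if c not in taken]
    long_ids = [cid for cid in chain_ids if len(cid) != 1]
    if len(long_ids) > len(free):
        raise ValueError("not enough one-character IDs for all chains")
    mapping = {cid: cid for cid in chain_ids if len(cid) == 1}
    mapping.update(zip(long_ids, free))
    return mapping


# Hypothetical chain IDs; "aA" is the chain from the question.
mapping = single_char_chain_ids(["A", "B", "aA"])

# Assumption: with Biopython, the mapping could be applied with something like
#     chain.id = mapping[chain.id]
# before the structure is written with PDBIO.
```

If only one chain is extracted, the mapping has a single entry, and any free character will do.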

Source https://stackoverflow.com/questions/64000149

QUESTION

Converting segments of large .cif files to smaller .pdb files

Asked 2020-May-18 at 15:16

I'm trying to carve out some binding sites with ligands from cif-files of ribosome crystal structures, and have encountered an annoying problem involving a type error.

TypeError: %c requires int or char

Using the code below,

...

ANSWER

Answered 2020-May-18 at 15:16

The chain name format in _ATOM_FORMAT_STRING is %c, while in this case you have chain named QA.
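The behaviour of `%c` can be reproduced with plain Python string formatting, independent of Biopython. This is a minimal sketch; the helper name `format_chain_field` is hypothetical and only stands in for the chain field of the format string mentioned above.

```python
def format_chain_field(chain_id):
    """Format a chain ID with '%c', which accepts an int or a single character."""
    return "%c" % chain_id


one_char = format_chain_field("A")  # a single character is accepted

try:
    format_chain_field("QA")        # two characters do not fit '%c'
    two_char_failed = False
except TypeError:
    two_char_failed = True
```

The `TypeError` raised for the two-character ID is the same kind of error reported in the question.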

Chain names in PDB files were traditionally single characters, but there are only so many letters and digits, and for ribosomes longer names are necessary. The PDB format has room for a second letter: the empty column to the left of the one-character chain name. Many programs support this, but not all, and it is not part of the official specification.

So you can either use PDB files with 2-character chains (if the rest of your workflow supports it) or rename chains in the output (your output is only a tiny part of the original structure).

Here is how to do it in gemmi:

Source https://stackoverflow.com/questions/60168883

QUESTION

How to save each ligand from a PDB file separately with Bio.PDB?

Asked 2020-May-18 at 13:06

I have a list of PDB files. I want to extract the ligands of all the files (so, heteroatoms) and save each one separately into PDB files, by using the Bio.PDB module from BioPython.

I tried some solutions, such as the one in "Remove heteroatoms from PDB", which I tried to adapt to keep the heteroatoms. But all I obtain is files with all the ligands together in the same file.

I also tried something like this:

...

ANSWER

Answered 2020-Apr-23 at 19:38

You were quite close.

But you have to provide a Select instance as the second argument to io.save. Have a look at the doc comment: it says that this argument should provide accept_model, accept_chain, accept_residue and accept_atom.

I created a class ResidueSelect that inherits from Bio.PDB.PDBIO.Select. That way I only have to override the methods we need, in our case those for chains and residues.

Because we only want to save the current residue in the current chain, I provide two respective arguments for the constructor.
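A sketch of such a selector is shown below; it is not the answer's original code. It assumes Biopython's `Select` base class is importable from `Bio.PDB`; the fallback class and the `SimpleNamespace` objects are stand-ins added here so the sketch also runs without Biopython or a real structure.

```python
from types import SimpleNamespace

try:
    from Bio.PDB import Select  # real base class when Biopython is installed
except ImportError:
    class Select:  # stand-in with the same four accept_* hooks
        def accept_model(self, model):
            return True

        def accept_chain(self, chain):
            return True

        def accept_residue(self, residue):
            return True

        def accept_atom(self, atom):
            return True


class ResidueSelect(Select):
    """Accept one residue in one chain and reject everything else."""

    def __init__(self, chain, residue):
        self.chain = chain
        self.residue = residue

    def accept_chain(self, chain):
        return chain.id == self.chain.id

    def accept_residue(self, residue):
        return residue.id == self.residue.id


# Stand-in objects that only carry the `id` attribute used above.
chain_a = SimpleNamespace(id="A")
chain_b = SimpleNamespace(id="B")
ligand = SimpleNamespace(id=("H_ATP", 501, " "))
other_residue = SimpleNamespace(id=(" ", 10, " "))

selector = ResidueSelect(chain_a, ligand)
```

With a real structure, one selector per ligand would be passed as the second argument to io.save, so that each output file receives a single residue.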

Source https://stackoverflow.com/questions/61390035

QUESTION

Extracting only the chains that we need from a PDB file

Asked 2019-Aug-19 at 07:12

I need to extract specific chains from PDB files (sometimes more than one chain). The question "How to extract chains from a PDB file?" is the same as mine, and its accepted answer addresses my problem, but it does not work in Python 3: it raises one error after another. Does anybody know how I can make it work in Python 3?

Or is there any other code for the same kind of problem?

Thank you in advance.

...

ANSWER

Answered 2019-Aug-19 at 07:12

retrieve_pdb_file has the optional parameter file_format. When no information is provided, the PDB server returns cif files. Biopython's parser expects a PDB file.

You can change the line to

Source https://stackoverflow.com/questions/57543344

QUESTION

TypeError when creating PDB file using Biopython's PDBIO, only with certain files

Asked 2018-May-29 at 12:46

I am writing a script that renumbers protein structures (CIF files) and then saves them (PDB files: Biopython does not have a CIF saving function).

For most of the files I use, it works. But for files like 6ek0.pdb, 5t2c.pdb, and 4v6x.pdb I keep getting the same TypeError on the same line of the io.save function. The error also occurs when I do not renumber the file and only read the input and write the output, like this:

...

ANSWER

Answered 2018-May-29 at 11:15

The error is triggered when BioPython tries to write a two-letter chain name using the %c format in _ATOM_FORMAT_STRING.

More generally, big structures like 5T2C (ribosome) cannot be written in the traditional PDB format. Many programs and libraries support two-character chain names (written in columns 21-22), but the standard is to have a single-character chain name in column 22. Then you need some extension of atom numbering to support more than 99,999 atoms - the most popular one is hybrid-36.
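To illustrate how hybrid-36 extends a five-column atom serial number beyond 99,999, here is a minimal sketch for non-negative numbers. It assumes the usual scheme: plain decimal while the number fits, then base-36 starting at an uppercase letter, then base-36 starting at a lowercase letter. The function names are hypothetical and this is not a reference implementation.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def _encode_base36(value, digits):
    """Encode a non-negative integer in base 36 using the given digit set."""
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = digits[rem] + out
    return out or digits[0]


def hybrid36_encode(width, value):
    """Encode a non-negative integer into a fixed-width hybrid-36 field."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    if value < 10 ** width:
        # Decimal range: right-justified, as in a classic PDB file.
        return f"{value:>{width}d}"
    value -= 10 ** width
    span = 26 * 36 ** (width - 1)        # size of each letter-led range
    offset = 10 * 36 ** (width - 1)      # makes the first digit a letter
    if value < span:
        return _encode_base36(value + offset, DIGITS_UPPER)
    value -= span
    if value < span:
        return _encode_base36(value + offset, DIGITS_LOWER)
    raise ValueError("value out of range for this field width")


last_decimal = hybrid36_encode(5, 99999)
first_upper = hybrid36_encode(5, 100000)
first_lower = hybrid36_encode(5, 43770016)
```

The encoded field always keeps the same width, which is why the column layout of the file is unchanged.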

Anyway, BioPython does not support big PDB files.

(If you describe exactly what you want to do, someone may be able to suggest another solution.)

Source https://stackoverflow.com/questions/50579608

QUESTION

How to separately get the X, Y or Z coordinates from a pdb file

Asked 2018-Feb-14 at 17:52

I have a PDB file '1abz' (https://files.rcsb.org/view/1ABZ.pdb), which contains the coordinates of a protein structure. Please ignore the header remark lines; the interesting information starts at line 276, which says 'MODEL 1'.

I would like to separately get the X, Y or Z coordinates from a pdb file.

This link explains the column numbers of a pdb file: http://cupnet.net/pdb-format/

This is the code that I have but I got an error message.

...

ANSWER

Answered 2017-Dec-15 at 04:10

>>> Bio.__version__
'1.69'
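Because the PDB format is column-oriented, the coordinates can also be read without Biopython by slicing fixed columns. The sketch below assumes the standard ATOM/HETATM layout (x in columns 31-38, y in 39-46, z in 47-54); the function name and the two sample lines are made up for illustration and are not taken from 1ABZ.

```python
def read_coordinates(lines):
    """Collect separate X, Y and Z lists from ATOM/HETATM records."""
    xs, ys, zs = [], [], []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # 1-based columns 31-38, 39-46, 47-54 become 0-based slices.
            xs.append(float(line[30:38]))
            ys.append(float(line[38:46]))
            zs.append(float(line[46:54]))
    return xs, ys, zs


# Illustrative records; other record types such as MODEL are skipped.
sample_lines = [
    "MODEL        1",
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
]

xs, ys, zs = read_coordinates(sample_lines)
```

To read a real file, the same function can be given an open file object, for example `read_coordinates(open("1abz.pdb"))`, where the file name is a placeholder.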

Source https://stackoverflow.com/questions/47825542

QUESTION

How to move protein coordinates with respect to a reference frame

Asked 2017-Dec-14 at 19:53

I would like to shift the coordinates with respect to a reference frame at the origin (i.e x=0, y=0, z=0) and generate a new coordinate file.

I read through the Biopython tutorial (http://biopython.org/wiki/The_Biopython_Structural_Bioinformatics_FAQ), used the transform method of the Atom object (http://biopython.org/DIST/docs/api/Bio.PDB.Atom.Atom-class.html#transform), and came up with this script, but without success.

How can I go about this? Many thanks in advance!

...

ANSWER

Answered 2017-Dec-14 at 19:53

In your last loop, for atom in residue, you define the function rotmat every time you loop over an atom, but you never call it.
Try removing the line def rotmat():
Currently, neither your rotation nor your translation would change the atom coordinates.

If you want, for example, to define C1 as your reference point, you could use the following code.

rotation_matrix is just a matrix which does not rotate your protein. np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) would do the same.
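The arithmetic behind moving a reference point to the origin is a plain subtraction, sketched here without Biopython or NumPy. The function name and the coordinates are hypothetical; with Biopython, the assumption is that the same shift would be applied per atom through the transform method with an identity rotation and the negated reference coordinates as the translation.

```python
def shift_to_reference(coords, reference):
    """Translate all coordinates so that `reference` ends up at the origin."""
    rx, ry, rz = reference
    return [(x - rx, y - ry, z - rz) for x, y, z in coords]


# Made-up coordinates; the first point plays the role of the reference atom.
coords = [(1.0, 2.0, 3.0), (4.0, 6.0, 8.0)]
shifted = shift_to_reference(coords, coords[0])
```

Distances between atoms are unchanged by this shift; only the frame of reference moves.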

Source https://stackoverflow.com/questions/47806879

QUESTION

Replace residue numbers in pdb with given residue numbers list

Asked 2017-Jun-21 at 13:16

I have a list of renumbered residue numbers, new_residues=[18,19,20,21,22,34,35,36,37.... 130,131,132], and I would like to replace the residue numbers in my pdb file with this list. Do you have any idea how to do the renumbering?

...

ANSWER

Answered 2017-Jun-21 at 13:16

In your example, you overwrite all the information about the residue, including which amino acid sits at that position.

Let's increment all ids in our file by 200, loop through the models and structures and then use get_residues() in combination with enumerate to get all the residues and an index.

The residue.id tuple is copied into a list, and only the residue number in it is changed. This list is then converted back to a tuple and written in place of the original id.
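The tuple-to-list round trip can be sketched on plain tuples. This assumes the Biopython residue id layout of (hetero flag, sequence number, insertion code); the function name and the sample ids are hypothetical.

```python
def shift_residue_id(residue_id, offset):
    """Return a copy of a residue id with only the sequence number shifted."""
    fields = list(residue_id)   # tuples are immutable, so edit a list copy
    fields[1] += offset         # index 1 holds the residue number
    return tuple(fields)        # back to a tuple, as the id is stored


# Made-up ids: two standard residues and one hetero residue.
original_ids = [(" ", 18, " "), (" ", 19, " "), ("H_ATP", 20, " ")]
new_ids = [shift_residue_id(rid, 200) for rid in original_ids]
```

The hetero flag and insertion code pass through untouched, so the information about what each residue is stays intact.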

Source https://stackoverflow.com/questions/44593855

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install pdbio

You can install it with 'pip install pdbio' or download it from GitHub or PyPI.
You can use pdbio like any standard Python library. Make sure you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.

Support

For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for existing answers or ask on the Stack Overflow community page.
