pdbio | Pandas-based Data Handler for VCF, BED, and SAM Files | Genomics library

by dceoy | Python | Version: 0.4.2 | License: MIT

kandi X-RAY | pdbio Summary

pdbio is a Python library typically used in Artificial Intelligence and Genomics applications. It has no reported bugs or vulnerabilities, a build file is available, it has a permissive license, and it has low support. You can install it with 'pip install pdbio' or download it from GitHub or PyPI.

Pandas-based Data Handler for VCF, BED, and SAM Files.

            kandi-support Support

              pdbio has a low active ecosystem.
              It has 13 star(s) with 0 fork(s). There are 2 watchers for this library.
              It had no major release in the last 12 months.
              pdbio has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of pdbio is 0.4.2

            kandi-Quality Quality

              pdbio has no bugs reported.

            kandi-Security Security

              pdbio has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            kandi-License License

              pdbio is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              pdbio releases are available to install and integrate.
              Deployable package is available in PyPI.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.

            Top functions reviewed by kandi - BETA

            kandi has reviewed pdbio and discovered the below as its top functions. This is intended to give you an instant insight into pdbio implemented functionality, and help decide if they suit your requirements.
            • Identify variants in a FASTA file
            • Read a single chromosome from a FASTA file
            • Convert a variant to a dict
            • Open a file
            • Convert file to csv
            • Sort the dataframe
            • Sort a DataFrame by chromosome and position
            • Write header to file
            • Load a SAM file
            • Convert a list of lines into a pandas DataFrame
            • View files
            • Returns the executable for a given command
            • Convert a cigar string to match chrs
            • Convert a string to a chrs
            • Convert md5 to chrs string
            • Return a new Pandas DataFrame with consolidated tags
            • Creates a pandas dataframe from a DataFrame
            • Calculate the median depth
            • Load the table
            • Load data from file
            • Configure logging
            • Write the samline
            • Loads the table
            • Sort by chromosome
            • Load a csv file
            • Get the region depth

            pdbio Key Features

            No Key Features are available at this moment for pdbio.

            pdbio Examples and Code Snippets

            No Code Snippets are available at this moment for pdbio.

            Community Discussions

            QUESTION

            Extracting chains from an electron microscopy structure
            Asked 2020-Sep-24 at 17:12

            I need to extract single chains from a structure file in cif format as available from the PDB. I've read several related questions, such as this and this. The proposed solution indeed works well if the chain ID is an integer or a single character. If applied to a structure such as 6KMW to extract chain aA it raises the error TypeError: %c requires int or char. Full code used to reproduce the error and output included below.

            ...

            ANSWER

            Answered 2020-Sep-24 at 17:12

            I think what you are trying to achieve is simply impossible. Effectively, you want to convert a cif file to a pdb file; that you also want to reduce the protein structure to a single chain in the process does not matter. The PDB format is a file format from the last century (I know how widespread it still is today). It is column-oriented and allows only one character for the chain ID. This is why you cannot download a PDB file for protein 6KMW. See the tooltip at https://www.rcsb.org/structure/6KMW: "PDB format files are not available for large structures". In your case, "large" means a structure with so many chains that they need two characters.

            You cannot store two characters as the chain name in a PDB file. You have two options:

            • Rename the chain "aA" and save the file in PDB format
            • Don't use the PDB format as your file format but stick to cif

            This snippet renames the chain and stores the structure as a pdb file:

            Source https://stackoverflow.com/questions/64000149

            QUESTION

            Converting segments of large .cif files to smaller .pdb files
            Asked 2020-May-18 at 15:16

            I'm trying to carve out some binding sites with ligands from cif-files of ribosome crystal structures, and have encountered an annoying problem involving a type error.

            TypeError: %c requires int or char

            Using the code below,

            ...

            ANSWER

            Answered 2020-May-18 at 15:16

            The chain name format in _ATOM_FORMAT_STRING is %c, while in this case you have chain named QA.
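The failure can be reproduced without Biopython, because it comes from Python's own `%c` conversion, which accepts only an integer or a one-character string. A minimal sketch (the helper name `format_chain` is hypothetical and not part of any library):

```python
def format_chain(chain_id):
    """Format a chain ID through a '%c' field, or report why it failed."""
    try:
        return "%c" % chain_id
    except TypeError as exc:
        # Raised when chain_id is a string longer than one character.
        return f"TypeError: {exc}"


print(format_chain("A"))   # one character: accepted by %c
print(format_chain("QA"))  # two characters: rejected by %c
```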

            Chain names in PDB files were traditionally single characters, but there are only so many letters and digits. For ribosomes, longer names are necessary. The pdb format has space for a second letter: an empty column to the left of the one-character chain name. Many programs support it, but not all, and it is not part of the official specification.

            So you can either use PDB files with 2-character chains (if the rest of your workflow supports it) or rename chains in the output (your output is only a tiny part of the original structure).
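The renaming option can be sketched independently of any structure library. The function below is an illustration only (the name `short_chain_ids` and the choice of A-Z, a-z, 0-9 as the pool of replacement IDs are assumptions): it keeps chain IDs that are already one character long and gives every longer ID the next unused character.

```python
import string


def short_chain_ids(chain_ids):
    """Map each chain ID to a unique single-character ID.

    One-character IDs are kept as they are; longer IDs receive the next
    unused character from A-Z, a-z, 0-9.
    """
    pool = string.ascii_uppercase + string.ascii_lowercase + string.digits
    mapping = {cid: cid for cid in chain_ids if len(cid) == 1}
    used = set(mapping.values())
    free = (c for c in pool if c not in used)
    for cid in chain_ids:
        if cid not in mapping:
            try:
                mapping[cid] = next(free)
            except StopIteration:
                raise ValueError("more chains than single-character IDs") from None
    return mapping


print(short_chain_ids(["A", "QA", "B", "QB"]))
```

The resulting mapping would then be applied to the chains of the extracted fragment before writing it out. This only works when the fragment contains at most 62 chains.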

            Here is how to do it in gemmi:

            Source https://stackoverflow.com/questions/60168883

            QUESTION

            How to save each ligand from a PDB file separately with Bio.PDB?
            Asked 2020-May-18 at 13:06

            I have a list of PDB files. I want to extract the ligands of all the files (so, heteroatoms) and save each one separately into PDB files, by using the Bio.PDB module from BioPython.

            I tried some solutions, like this one: Remove heteroatoms from PDB, which I tried to adapt to keep the heteroatoms. But all I get is files with all the ligands in the same file.

            I also tried something like this:

            ...

            ANSWER

            Answered 2020-Apr-23 at 19:38

            You were quite close.

            But you have to provide a Select class as second argument to io.save. Have a look at the doc comment. It says that this argument should provide accept_model, accept_chain, accept_residue and accept_atom.

            I created a class ResidueSelect that inherits from Bio.PDB.PDBIO.Select. That way I only have to override the methods we need, in our case those for chains and residues.

            Because we only want to save the current residue in the current chain, I provide two respective arguments for the constructor.
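The answer's original code is not reproduced on this page. What follows is a sketch of what such a class might look like, not the answerer's implementation. Comparing by `chain.id` and `residue.id` is an assumption, and the fallback `Select` class exists only so the sketch runs when Biopython is not installed.

```python
from types import SimpleNamespace

try:
    from Bio.PDB.PDBIO import Select
except ImportError:
    # Stand-in used only when Biopython is missing: accepts everything.
    class Select:
        def accept_model(self, model):
            return True

        def accept_chain(self, chain):
            return True

        def accept_residue(self, residue):
            return True

        def accept_atom(self, atom):
            return True


class ResidueSelect(Select):
    """Accept a single residue in a single chain; models and atoms pass through."""

    def __init__(self, chain_id, residue_id):
        self.chain_id = chain_id
        self.residue_id = residue_id

    def accept_chain(self, chain):
        return chain.id == self.chain_id

    def accept_residue(self, residue):
        return residue.id == self.residue_id


# Stand-in objects that carry only the attribute the selector inspects.
selector = ResidueSelect("A", ("H_ATP", 501, " "))
print(selector.accept_chain(SimpleNamespace(id="A")))
print(selector.accept_residue(SimpleNamespace(id=(" ", 10, " "))))
```

With a real structure, the selector would be passed as the second argument to `io.save`, once per hetero residue and with a different output file name each time.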

            Source https://stackoverflow.com/questions/61390035

            QUESTION

            Extracting only the chains that we need from a PDB file
            Asked 2019-Aug-19 at 07:12

            I need to extract specific chains from PDB files (sometimes more than one chain). The question "How to extract chains from a PDB file?" asks the same thing, and its accepted answer addresses my problem, but it does not work in Python 3; it gives one error after another. Does anybody know how I can make this work in Python 3?

            Or any other code for the same kind of problem

            Thank you in advance.

            ...

            ANSWER

            Answered 2019-Aug-19 at 07:12

            retrieve_pdb_file has the optional parameter file_format. When no information is provided, the PDB server returns cif files. Biopython's parser expects a PDB file.

            You can change the line to

            Source https://stackoverflow.com/questions/57543344

            QUESTION

            TypeError when creating PDB file using Biopython's PDBIO, only with certain files
            Asked 2018-May-29 at 12:46

            I am writing a script that renumbers protein structures (CIF files) and then saves them (PDB files: Biopython does not have a CIF saving function).

            For most of the files I use, it works. But for files like 6ek0.pdb, 5t2c.pdb, and 4v6x.pdb I keep getting the same TypeError for the same line of the io.save function. The error also is there when I do not renumber the file, only have input and output like this:

            ...

            ANSWER

            Answered 2018-May-29 at 11:15

            The error is triggered when BioPython tries to write two-letter chain name using %c format in _ATOM_FORMAT_STRING.

            More generally, big structures like 5T2C (ribosome) cannot be written in the traditional PDB format. Many programs and libraries support two-character chain names (written in columns 21-22), but the standard is to have a single-character chain name in column 22. Then you need some extension of atom numbering to support more than 99,999 atoms - the most popular one is hybrid-36.
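To illustrate the hybrid-36 idea for atom serial numbers: values that fit in the field are written as plain decimals, and larger values continue in base 36, first with upper-case and then with lower-case digits. The sketch below follows the published hybrid-36 scheme as I understand it, restricted to non-negative values. The function names are my own, and it should not be taken as the reference implementation.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def _encode_base36(digits, value):
    """Encode a non-negative integer in base 36 using the given digit set."""
    out = ""
    while value:
        value, rem = divmod(value, 36)
        out = digits[rem] + out
    return out or digits[0]


def hy36_encode(width, value):
    """Encode a non-negative integer into a fixed-width hybrid-36 field."""
    if 0 <= value < 10 ** width:
        return f"{value:{width}d}"          # plain right-justified decimal
    block = 26 * 36 ** (width - 1)          # size of each base-36 range
    offset = 10 * 36 ** (width - 1)         # makes the first digit a letter
    value -= 10 ** width
    if 0 <= value < block:
        return _encode_base36(DIGITS_UPPER, value + offset)
    value -= block
    if 0 <= value < block:
        return _encode_base36(DIGITS_LOWER, value + offset)
    raise ValueError("value out of range for hybrid-36")


for serial in (99999, 100000, 100001):
    print(serial, hy36_encode(5, serial))
```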

            Anyway, BioPython does not support big PDB files.

            (if you write what exactly you want to do someone may be able to suggest another solution)

            Source https://stackoverflow.com/questions/50579608

            QUESTION

            How to separately get the X, Y or Z coordinates from a pdb file
            Asked 2018-Feb-14 at 17:52

            I have a PDB file '1abz' (https://files.rcsb.org/view/1ABZ.pdb), which contains the coordinates of a protein structure. Please ignore the header remark lines; the interesting information starts at line 276, which says 'MODEL 1'.

            I would like to separately get the X, Y or Z coordinates from a pdb file.

            This link explains the column numbers of a pdb file: http://cupnet.net/pdb-format/

            This is the code that I have but I got an error message.

            ...

            ANSWER

            Answered 2017-Dec-15 at 04:10
            >>> Bio.__version__
            '1.69'
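The rest of the answer's code is not preserved here. As a library-independent sketch, the coordinates can be read straight from the fixed columns of the ATOM and HETATM records, assuming the standard PDB layout with x, y, and z in columns 31-38, 39-46, and 47-54. The function name and the sample record are illustrative.

```python
def atom_coordinates(pdb_lines):
    """Yield (x, y, z) for every ATOM/HETATM record, read from fixed columns."""
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # 1-based columns 31-38, 39-46, 47-54 hold x, y, z.
            yield float(line[30:38]), float(line[38:46]), float(line[46:54])


# Build a sample ATOM record field by field so the columns line up.
sample = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{' '}{'MET':>3} {'A'}{1:>4}{' '}   "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
)

# Transpose the (x, y, z) tuples into separate X, Y, and Z sequences.
xs, ys, zs = zip(*atom_coordinates([sample]))
print(xs, ys, zs)
```

For a file with several models, such as 1ABZ, this yields the atoms of all models in order. Restricting it to one model would mean tracking the MODEL and ENDMDL records.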
            

            Source https://stackoverflow.com/questions/47825542

            QUESTION

            How to move protein coordinates with respect to a reference frame
            Asked 2017-Dec-14 at 19:53

            I have a PDB file '1abz' (https://files.rcsb.org/view/1ABZ.pdb), which contains the coordinates of a protein structure. Please ignore the header remark lines; the interesting information starts at line 276, which says 'MODEL 1'.

            I would like to shift the coordinates with respect to a reference frame at the origin (i.e x=0, y=0, z=0) and generate a new coordinate file.

            I read through biopython tutorial (http://biopython.org/wiki/The_Biopython_Structural_Bioinformatics_FAQ), used the transform method of the Atom object (http://biopython.org/DIST/docs/api/Bio.PDB.Atom.Atom-class.html#transform), and came up with this script but no success.

            How can I go about this? Many thanks in advance!

            ...

            ANSWER

            Answered 2017-Dec-14 at 19:53
            • In your last loop for atom in residue you are defining the function rotmat every time you loop over an atom but you never call the function.
            • Try removing the line def rotmat():
            • Currently both your rotation and your translation wouldn't change the atom coordinates.

            If you want for example to define C1 as your reference point you could use the following code.

            rotation_matrix is just a matrix which does not rotate your protein. np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) would do the same.
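The arithmetic behind this is small enough to show without Biopython or NumPy. The sketch below assumes a transform right-multiplies each coordinate by the rotation matrix and then adds the translation. It uses the identity matrix as the rotation and the negated reference coordinates as the translation, which moves the reference point to the origin. All names are illustrative.

```python
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # a rotation that changes nothing


def transform(coord, rotation, translation):
    """Return coord x rotation + translation for 3-element sequences."""
    rotated = [sum(coord[i] * rotation[i][j] for i in range(3)) for j in range(3)]
    return [r + t for r, t in zip(rotated, translation)]


def shift_to_reference(coords, reference):
    """Translate all coordinates so that `reference` lands on the origin."""
    translation = [-c for c in reference]
    return [transform(c, IDENTITY, translation) for c in coords]


atoms = [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]]
print(shift_to_reference(atoms, reference=atoms[0]))
```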

            Source https://stackoverflow.com/questions/47806879

            QUESTION

            Replace residue numbers in pdb with given residue numbers list
            Asked 2017-Jun-21 at 13:16

            I have a list of renumbered residue numbers, new_residues=[18,19,20,21,22,34,35,36,37.... 130,131,132], and I would like to replace the residue numbers in my pdb file with this list. Do you have any idea how to do the renumbering?

            ...

            ...

            ANSWER

            Answered 2017-Jun-21 at 13:16

            In your example you are overwriting all the information about the residue, also the info about the amino acid in the particular position.

            Let's increment all ids in our file by 200, loop through the models and structures and then use get_residues() in combination with enumerate to get all the residues and an index.

            The residue.id is stored in a list and only the id is changed. This list is then converted back to a tuple and written in place of the original id.
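The tuple-to-list-to-tuple step can be shown on its own. This sketch assumes a residue id is a three-element tuple of hetero flag, sequence number, and insertion code, and changes only the sequence number. The function name is hypothetical.

```python
def shifted_residue_id(residue_id, offset):
    """Return a copy of a (hetero flag, number, insertion code) id with the number shifted."""
    parts = list(residue_id)  # tuples are immutable, so edit a list copy
    parts[1] += offset        # only the sequence number changes
    return tuple(parts)


print(shifted_residue_id((" ", 18, " "), 200))
```

Inside the loop over residues, the result would be assigned back to the residue's id, leaving the residue name and its atoms untouched.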

            Source https://stackoverflow.com/questions/44593855

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install pdbio

            You can install using 'pip install pdbio' or download it from GitHub, PyPI.
            You can use pdbio like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

            For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search or ask on the Stack Overflow community page.

            Install
          • PyPI

            pip install pdbio

          • CLONE
          • HTTPS

            https://github.com/dceoy/pdbio.git

          • CLI

            gh repo clone dceoy/pdbio

          • SSH

            git@github.com:dceoy/pdbio.git
