PathOS | clinical application for filtering | Genomics library

 by PapenfussLab | Java | Version: v1.5.3 | License: GPL-3.0

kandi X-RAY | PathOS Summary

PathOS is a Java library typically used in Healthcare, Pharma, Life Sciences, Artificial Intelligence, and Genomics applications. PathOS has no reported bugs or vulnerabilities, carries a strong copyleft license, and has low support. However, a PathOS build file is not available. You can download it from GitHub.

Clinical diagnostics is being transformed by DNA sequencing technology capable of analysing patient samples at the nucleotide level. Translating the data from this technology into clinically useful information requires decision support software that can analyse sequencer output and allow clinical scientists to interpret the DNA variations. High-throughput sequencing generates many technical artefacts from the chemical processing involved in sequencing; these artefacts must be identified and filtered out of the data before further analysis. The curation process requires identifying and annotating DNA changes: SNPs (single nucleotide polymorphisms), indels (insertions and deletions), CNVs (copy number variants) and SVs (structural variants) within a sample of patient DNA (either blood or tumour). Once annotated, mutations are matched against internal and external databases to identify known pathogenic (disease-causing) or actionable mutations (variants for which an appropriate drug exists). The resulting few variants are then rendered into a clinical diagnostic report for the treating clinician, incorporating clinical evidence and relevant publications. PathOS carries out these tasks within a clinical laboratory setting, where many patients must be reported on in a reliable, consistent and efficient manner.

            Support

              PathOS has a low active ecosystem.
              It has 23 stars and 8 forks. There are 6 watchers for this library.
              It had no major release in the last 12 months.
              There are 5 open issues and 6 have been closed. On average, issues are closed in 156 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of PathOS is v1.5.3.

            Quality

              PathOS has no bugs reported.

            Security

              PathOS has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

            License

              PathOS is licensed under the GPL-3.0 License. This license is strong copyleft.
              Strong copyleft licenses require derivative works to be shared under the same license; you can use them when creating open source projects.

            Reuse

              PathOS releases are available to install and integrate.
              PathOS has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions are available. Examples and code snippets are not available.

            Top functions reviewed by kandi - BETA

            kandi has reviewed PathOS and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality PathOS implements, and to help you decide if it suits your requirements.
            • Init the superclass application
            • Initialization method
            • Returns the list of child names for the given structure
            • Initialize the diagnostic
            • Initialize the factory
            • Initialize the Service
            • Initialize
            • Init this class
            • This method initialises the Builder
            • Initialize the object
            • Initializes the model class
            • Initialize the model
            • Initialize this class
            • Initialize Method
            • Initialize the class
            • Initialization
            • Inits this class
            • Initialize the Model
            • Initialize this factory
            • Initialize this model

            PathOS Key Features

            No Key Features are available at this moment for PathOS.

            PathOS Examples and Code Snippets

            No Code Snippets are available at this moment for PathOS.

            Community Discussions

            QUESTION

            How to import script that requires __name__ == "__main__"
            Asked 2021-Jun-08 at 22:10

            I'm pretty new to Python, and this question probably shows that. I'm working on the multiprocessing part of my script and couldn't find a definitive answer to my problem.

            I'm struggling with one thing. When using multiprocessing, part of the code has to be guarded with if __name__ == "__main__". I get that, and my pool is working great. But I would love to import that whole script (making it one big function that returns a result would be best). And here is the problem. First, how can I import something if part of it will only run when launched from the main/source file because of that guard? Secondly, if I manage to work it out and the whole script ends up in one big function, pickle can't handle that; will using "multiprocessing on dill" or "pathos" fix it?

            Thanks!

            ...

            ANSWER

            Answered 2021-Jun-08 at 22:10

            You are probably confused about the concept. The if __name__ == "__main__" guard in Python exists precisely so that every Python file can be imported.

            Without the guard, a file, once imported, would have the same behavior as if it were the "root" program, and it would require a lot of boilerplate and inter-process communication (like writing a "PID" file at a fixed filesystem location) to coordinate imports of the same code, including for multiprocessing.

            Just leave under the guard whatever code needs to run for the root process. Everything else you move into functions that you can call from the importing code.

            If you ran "all" of the script, even the part setting up the multiprocessing workers would run, and any simple job would create more workers exponentially until all machine resources were taken (i.e. it would crash hard and fast, potentially leaving the machine in an unresponsive state).

            So this is a good pattern: the "dothejob" function can call all the other functions you need, so you just need to import and call it, either from a master process or from any other project importing your file as a Python module.
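
            As an illustration of this pattern (a minimal sketch; "dothejob" is taken from the answer's wording, while "work" and the pool size are placeholders, not code from the thread):

            from multiprocessing import Pool

            def work(x):
                # unit of work executed in a worker process
                return x * x

            def dothejob():
                # reusable entry point: safe to call from any importing module
                with Pool(processes=4) as pool:
                    return pool.map(work, range(10))

            if __name__ == "__main__":
                # runs only when this file is executed directly, not when it is imported
                print(dothejob())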

            Source https://stackoverflow.com/questions/67895094

            QUESTION

            Python multiprocessing pool creating duplicate lists
            Asked 2021-May-23 at 18:42

            I'm trying to figure out multiprocessing and I've run into something I don't understand at all.

            I'm using pathos.multiprocessing for better pickling. The following code creates a list of objects which I want to iterate through. However, when I run it, it prints several different lists, despite them all referring to the same variable.

            ...

            ANSWER

            Answered 2021-May-23 at 18:42

            When using multiprocessing, the library spawns multiple different processes. Each process has its own address space. This means that each of those processes has its own copy of the variable, and a change in one process will not be reflected in the other processes.

            In order to use shared memory, you need special constructs to define your global variables. For pathos.multiprocessing, from this comment, it seems you can declare multiprocessing-type shared variables by simply importing the following:
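
            The exact import the answer refers to is not reproduced on this page. As a general, hedged sketch of process-shared state with the dill-based multiprocess package that pathos builds on (all names below are illustrative):

            from multiprocess import Manager, Pool   # multiprocess: the dill-based fork of multiprocessing used by pathos

            def append_square(args):
                shared, x = args
                shared.append(x * x)                 # proxy list: updates are visible to every process

            if __name__ == "__main__":
                with Manager() as manager:
                    shared = manager.list()
                    with Pool(4) as pool:
                        pool.map(append_square, [(shared, i) for i in range(8)])
                    print(sorted(shared))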

            Source https://stackoverflow.com/questions/67663109

            QUESTION

            Parallel writing in MySQL. How to write streaming entries to 100 different tables in MySQL database using Python mysql.connector
            Asked 2021-May-05 at 16:04

            I am wondering how I can write streaming data to different MySQL tables in parallel.

            I have the following code, where GetStreaming() returns the list of tuples [(tbName, data1, data2), (tbName, data1, data2), ...] available at the time of the call.

            ...

            ANSWER

            Answered 2021-May-05 at 16:04

            Each "parallel" insertion process needs its own connector and cursor. You can't share them across threads of any sort.

            You can use connection pooling to speed up the allocation and release of connections.

            There's no magic in MySQL (or any DBMS costing less than the GDP of a small country) that lets it scale up to handle large-scale data insertion over ~100 simultaneous connections. Paradoxically, more connections can give lower throughput than fewer connections, because of contention between them. You may want to rethink your system architecture so it works well with a few connections.

            In other words: fewer, bigger tables perform much better than many small tables.

            Finally, read about ways of speeding up bulk inserts, for example multi-row inserts.
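
            A hedged sketch of this advice (connection parameters, table and column names are made up; this is not code from the thread): each worker opens its own connection and cursor, and each batch of rows is written with a single multi-row INSERT via executemany().

            import mysql.connector                      # pip install mysql-connector-python
            from multiprocessing import Pool

            def insert_batch(job):
                table, rows = job                       # rows: list of (data1, data2) tuples
                conn = mysql.connector.connect(         # one connection and cursor per worker call
                    host="localhost", user="user", password="secret", database="mydb")
                cur = conn.cursor()
                cur.executemany(                        # one multi-row INSERT for the whole batch
                    f"INSERT INTO {table} (data1, data2) VALUES (%s, %s)", rows)
                conn.commit()
                cur.close()
                conn.close()
                return table, len(rows)

            if __name__ == "__main__":
                jobs = [("tb_a", [(1, 2), (3, 4)]), ("tb_b", [(5, 6)])]
                with Pool(processes=4) as pool:         # keep the pool small; see the contention note above
                    print(pool.map(insert_batch, jobs))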

            Source https://stackoverflow.com/questions/67403752

            QUESTION

            Error from running tensorflow models in parallel, when sequentially it works fine
            Asked 2021-Mar-29 at 22:15

            Trying to use multiple TensorFlow models in parallel using pathos.multiprocessing.Pool

            Error is:

            ...

            ANSWER

            Answered 2021-Mar-29 at 12:13

            I'm the author of pathos. Whenever you see self._value in the error, what's generally happening is that something you tried to send to another processor failed to serialize. The error and traceback are a bit obtuse, admittedly. However, what you can do is check the serialization with dill, and determine whether you need to use one of the serialization variants (like dill.settings['trace'] = True), or whether you need to restructure your code slightly to better accommodate serialization. If the class you are working with is something you can edit, then an easy thing to do is to add a __reduce__ method, or similar, to aid serialization.
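
            As a sketch of that debugging approach (the wrapper class below is hypothetical, not code from the question): check what dill can serialize with dill.pickles, and give hard-to-pickle classes a __reduce__ method.

            import dill

            def load_model(path):
                return object()                     # stand-in for a real, hard-to-pickle model handle

            class ModelWrapper:
                def __init__(self, path):
                    self.path = path
                    self.model = load_model(path)   # live handle that may not serialize

                def __reduce__(self):
                    # rebuild from the path instead of pickling the live handle
                    return (ModelWrapper, (self.path,))

            print(dill.pickles(ModelWrapper("weights.h5")))   # True if dill can serialize it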

            Source https://stackoverflow.com/questions/66844587

            QUESTION

            Calling multiprocessing pool within a function is very slow
            Asked 2021-Mar-25 at 23:09

            I am trying to use pathos for triggering multiprocessing within a function. I notice, however, an odd behaviour and don't know why:

            ...

            ANSWER

            Answered 2021-Mar-25 at 23:09

            Instead of from pathos.multiprocessing import ProcessPool as Pool, I used from multiprocess import Pool, which is essentially the same thing. Then I tried some alternative approaches.

            So:
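
            The answer's own comparison code is not reproduced on this page; purely as an illustration of the interchangeable imports mentioned above:

            from multiprocess import Pool              # dill-backed drop-in for multiprocessing.Pool
            # from pathos.multiprocessing import ProcessPool   # built on the same multiprocess machinery

            def square(x):
                return x * x

            if __name__ == "__main__":
                with Pool(4) as pool:
                    print(pool.map(square, range(8)))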

            Source https://stackoverflow.com/questions/66788163

            QUESTION

            Using shared list with pathos multiprocessing raises `digest sent was rejected` error
            Asked 2021-Feb-28 at 12:46

            I am attempting to use multiprocessing for the generation of complex, unpicklable objects, as per the following code snippet:

            ...

            ANSWER

            Answered 2021-Feb-28 at 12:46

            So I have resolved this issue. It would still be great if someone like mmckerns, or someone else with more knowledge of multiprocessing than me, could comment on why this is a solution.

            The issue seemed to have been that the Manager().list() was declared in __init__. The following code works without any issues:
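
            The poster's working code is not reproduced on this page. What follows is only a minimal, hedged sketch of the change described, with the Manager().list() created at the top level and passed in rather than created inside __init__ (class and method names are illustrative):

            from multiprocess import Manager
            from pathos.pools import ProcessPool

            class Generator:
                def __init__(self, shared):
                    self.shared = shared            # proxy created elsewhere, only stored here

                def build(self, i):
                    self.shared.append({"id": i})   # stand-in for a "complex" generated object
                    return i

            if __name__ == "__main__":
                manager = Manager()
                shared = manager.list()             # created at the top level, not inside __init__
                gen = Generator(shared)
                pool = ProcessPool(nodes=2)
                pool.map(gen.build, range(4))
                pool.close()
                pool.join()
                print(list(shared))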

            Source https://stackoverflow.com/questions/66400064

            QUESTION

            Trouble installing turbodbc
            Asked 2021-Jan-11 at 20:49

            I am attempting to install turbodbc on my Ubuntu 20.10 machine.
            My specs are as follows: pip 20.2.4, Python 3.8.5 , gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0

            I have attempted the solutions provided in the previous posts here and here.

            I am getting this error message

            ...

            ANSWER

            Answered 2021-Jan-11 at 20:49

            Boost is not installed. You can try this

            Source https://stackoverflow.com/questions/65674126

            QUESTION

            Using Pygame with parallelism in python
            Asked 2020-Oct-11 at 22:51

            I am trying to train a neural network to play a SMB1 game made using Pygame. To get some speed up, I would like to use parallel processing in order to play multiple copies of the game at once, by different members of the population (and on different training data).

            The root of my problem comes from the fact that Pygame isn't inherently instance-based; that is, it will only ever generate one window, with one display object. Because I can't create multiple Pygame windows and display objects for each process, the processes have to share a display object. This leads me to my first questions: Is there a way to have multiple instances of pygame, and if not, is there a (performance-light) method of concurrently drawing onto the display? I.e., each game draws to a different section of the whole window.

            However, I don't really need every game to be rendered; I only care that at least one game instance is rendered, so that I can monitor its progress. My solution was then to assign each game a process id, and only the game with process id 0 would actually draw to the display. Concurrency problems solved! To accomplish this, I used multiprocessing.Process:

            ...

            ANSWER

            Answered 2020-Oct-11 at 22:51

            I'll try to attack the main issues. My understanding of your actual problem is quite limited as I don't know what your code actually does.

            "a way to use pygame as multiple different instances in different threads, spawned from the same process"

            This doesn't work as pygame is built on SDL2 which states "You should not expect to be able to create a window, render, or receive events on any thread other than the main one."

            "a way to safely concurrently work with pygame's display (and update clock)"

            Same as above, the display only work in the main thread.

            "a way to use multiprocessing.Process such that it doesn't need to pickle the method's class but can still access the class variables"

            You could pickle the methods using something like dill, but it feels (to me) wrong to copy full-on Python objects between processes. I'd go for another solution.

            "a multiprocessing library that:" 1. Either doesn't need to pickle the lambda functions or is able to

            You need to serialize Python objects in order to send them between processes.

            2. Has a way to tell the subprocess which process worker is being used

            I don't understand what this means.

            It seems to me that the problem could be solved with better separation of data and visualization. The training should have no knowledge about any visualization, as it's not dependent on how you want to display it. So there should not be any reason to share the pygame display.

            Once this is done, it shouldn't be too much of a problem (multi-threading always causes problems) to do what you're trying to do. Regarding the pickle issue: I'd try to avoid pickling Python objects and functions, and instead just pass basic primitives between threads and processes. It seems like you should be able to assign self.fitnessFromArray a simple int instead, and based on its value do the min/avg/max calculation in the thread/process.

            If you want to do threading, then the main thread will take care of the rendering. It'll also spawn threads for the training. When the threads are completed they'll return their results (or put them in thread-safe storage) and the main thread will poll the data and display the results. If the work done by the training takes longer than one frame, then divide up the work so each thread only partially trains and can continue where it left off on the next frame.

            The principle is the same if you instead want separate processes. The main process starts up several training processes and connects to them via sockets. From the sockets, you'd poll information about the state of the program and display it. It would basically be a client-server architecture (albeit on localhost) where the training scripts are servers and the main process is a client.
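
            A minimal sketch of the threading layout described above (all names are illustrative, not from the question): worker threads push their results onto a queue, and the main thread, which owns the pygame display, polls the queue each frame.

            import queue
            import threading
            import pygame

            def train(worker_id, results):
                fitness = worker_id * 10            # placeholder for the real training step
                results.put((worker_id, fitness))   # thread-safe hand-off to the main thread

            def main():
                pygame.init()
                screen = pygame.display.set_mode((320, 240))
                results = queue.Queue()
                for i in range(4):
                    threading.Thread(target=train, args=(i, results), daemon=True).start()

                running = True
                while running:
                    for event in pygame.event.get():
                        if event.type == pygame.QUIT:
                            running = False
                    while not results.empty():      # poll without blocking the render loop
                        print("result:", results.get())
                    screen.fill((0, 0, 0))
                    pygame.display.flip()
                pygame.quit()

            if __name__ == "__main__":
                main()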

            Source https://stackoverflow.com/questions/64298579

            QUESTION

            How to efficiently share dicts and lists between processes using ProcessPool
            Asked 2020-Jul-11 at 09:53

            Let's consider the following example:

            ...

            ANSWER

            Answered 2020-Jul-11 at 09:53

            Converting both the list and dict variables to pathos.helpers.mp.Array, without an intermediate pa.helpers.mp.Manager, as suggested by @Mike McKerns, brought the desired performance boost.
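
            A minimal, hedged sketch of the shared-Array idea, written here with the standard multiprocessing module (pathos.helpers.mp is the dill-based multiprocess fork, which mirrors this API); the data and function are illustrative:

            from multiprocessing import Array, Pool

            def init_worker(shared):
                global shared_arr                   # make the shared Array visible to the workers
                shared_arr = shared

            def scale(i):
                shared_arr[i] *= 2.0                # in-place update, no per-process copy of the data
                return shared_arr[i]

            if __name__ == "__main__":
                arr = Array("d", [1.0, 2.0, 3.0, 4.0])
                with Pool(2, initializer=init_worker, initargs=(arr,)) as pool:
                    print(pool.map(scale, range(4)))
                print(list(arr))                    # [2.0, 4.0, 6.0, 8.0]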

            Source https://stackoverflow.com/questions/62824154

            QUESTION

            What is the canonical way to use locking with `pathos.pools.ProcessPool`?
            Asked 2020-Jul-04 at 12:39

            Let's consider the following example:

            ...

            ANSWER

            Answered 2020-Jul-04 at 12:39

            pathos leverages multiprocess, which has the same interface as multiprocessing but uses dill. You can access it in either of these ways.
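
            One possible pattern, shown as a hedged sketch rather than the canonical form from the linked answer: pass a Manager lock proxy, which pickles cleanly, to the pool workers.

            from multiprocess import Manager
            from pathos.pools import ProcessPool

            def work(args):
                i, lock = args
                with lock:                          # serialize access to the shared resource
                    print(f"task {i} holds the lock")
                return i

            if __name__ == "__main__":
                manager = Manager()
                lock = manager.Lock()               # proxy object: safe to pickle and send to workers
                pool = ProcessPool(nodes=2)
                print(pool.map(work, [(i, lock) for i in range(4)]))
                pool.close()
                pool.join()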

            Source https://stackoverflow.com/questions/62727733

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install PathOS

            The best way to experiment with PathOS is by accessing the instance on the University of Melbourne research cloud. Get in touch with the authors for a login account.
            PathOS has been deployed in a number of clinical environments, but it is a large, complex application with a number of interfaces to external systems that need to be integrated for full-featured operation. It has been built from the ground up to meet the clinical workflow needs of the Peter MacCallum Cancer Centre, and this is reflected in some of the architectural decisions. The repository can be built using the installAll.sh bash script at the top level. This script runs on Linux or OSX but would need to be customised for local requirements and ported for Windows environments.
            • Java JDK 1.7 from Oracle
            • Grails (we use 2.3.7 at time of writing) from https://github.com/grails/grails-core/releases/download/v2.3.7/grails-2.3.7.zip
            • Gradle (we use 1.10 at time of writing)
            • Git and a git ssh key with access to the PathOS git repository
            • MySQL or MariaDB (currently MariaDB 5.5.50)
            • Tomcat (currently version 7.0)
            • The report renderer uses a commercial package available as a JAR from http://www.aspose.com/downloads/words/java. Without a license, a "No License" message will appear on generated reports.
            • Some of the pipeline utilities and HGVS libraries use the Genome Analysis Toolkit (currently GATK 3.3) and the Sting utility JAR (currently 2.1.8), available from https://software.broadinstitute.org/gatk/download/
            • JNI wrapper for the striped Smith-Waterman alignment library SSW; see https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library

            Support

            Dr. Kenneth Doig, Head, Clinical Informatics Lab, Research Department, Peter MacCallum Cancer Centre, Victorian Comprehensive Cancer Centre, Grattan Street, Melbourne VIC 3000, Australia. Ph: +61 411 225 178. Mail: ken.doig@petermac.org