lifelines | Survival analysis in Python | Machine Learning library

 by CamDavidsonPilon | Python | Version: 0.28.0 | License: MIT

kandi X-RAY | lifelines Summary

lifelines is a Python library typically used in Artificial Intelligence and Machine Learning applications. kandi reports no bugs and no vulnerabilities for lifelines; it has a build file available, a Permissive License, and high support. You can install it using 'pip install lifelines' or download it from GitHub or PyPI.

What is survival analysis and why should I learn it? Survival analysis was originally developed and applied heavily by the actuarial and medical communities. Its purpose was to answer why events occur now versus later under uncertainty (where events might refer to deaths, disease remission, etc.). This is great for researchers who are interested in measuring lifetimes: they can answer questions like "what factors might influence deaths?"

            kandi-support Support

              lifelines has a highly active ecosystem.
              It has 2110 stars and 535 forks. There are 71 watchers for this library.
              There was 1 major release in the last 6 months.
              There are 244 open issues and 664 have been closed. On average, issues are closed in 36 days. There are 9 open pull requests and 0 closed requests.
              It has a positive sentiment in the developer community.
              The latest version of lifelines is 0.28.0

            kandi-Quality Quality

              lifelines has 0 bugs and 0 code smells.

            kandi-Security Security

              lifelines has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              lifelines code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              lifelines is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              lifelines releases are available to install and integrate.
              Deployable package is available in PyPI.
              Build file is available. You can build the component from source.
              lifelines saves you 7163 person hours of effort in developing the same functionality from scratch.
              It has 15109 lines of code, 1323 functions and 73 files.
              It has high code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed lifelines and discovered the below as its top functions. This is intended to give you an instant insight into lifelines implemented functionality, and help decide if they suit your requirements.
            • Fit the model
            • Add intercept column
            • Convert x to a list
            • Transform a dataframe into a dataframe
            • Fit a CoxPhFitter
            • Compute central_stats
            • Compute standard errors
            • Returns a semi - parametric pmfitter
            • Fit an interval censored model
            • Print a summary of the data table
            • Plot a qq plot
            • Return a pandas Series containing the percentile of the model
            • Print a summary of the model
            • Compute the optimal gradient of the regression
            • Fit left censored distribution
            • Print a summary table
            • Plot the model
            • Predict the hazard of a hazard
            • Fit the Normalization model
            • Compute the survival difference between a fixed point and a fixed point
            • Estimate the Bayesian Estimator estimator
            • Fit the CoxPhFitter
            • Add a covariance matrix to a dataframe
            • Compute the pairwise log rank test
            • Fit the regression model to a DataFrame
            • Plot interval censored lifetimes

            lifelines Examples and Code Snippets

            Example
            Python | Lines of Code: 46 | License: No License
            import lifelines
            from sklearn_lifelines.estimators_wrappers import CoxPHFitterModel
            from sklearn.pipeline import make_pipeline
            from sklearn.cross_validation import train_test_split
            from patsylearn import PatsyTransformer
            
            data = lifelines.datasets.lo  
            Code Visualizer, Configuration for Your Project
            Java | Lines of Code: 35 | License: Permissive (Apache-2.0)
            {
              "override": false,
              "lifelines": [
                {
                  "name": ".*Controller.*",
                  "annotation": ".*Controller",
                  "entryPoint": true
                },
                {
                  "name": ".*Service.*"
                },
                {
                  "name": ".*Repository.*",
                  "annotation": ".*(R  
            DESCRIPTION
            Perl | Lines of Code: 6 | License: No License
            Perl 5.005 or later
            ActivePerl5 Build Number 520 or later has been reported to work
            
            Date::Manip.pm       to work with dates
            Text::Soundex.pm     to use soundex
            Parse::RecDescent.pm to use lines2perl
            Roman.pm             to use the LifeLines function  

            Community Discussions

            QUESTION

            Python CoxPHFitter to extract hazard ratio and confidence intervals
            Asked 2022-Mar-15 at 12:28

            I am new to survival analysis and I have been reading many research papers where the authors report adjusted (for age and gender) and unadjusted hazard ratios along with confidence intervals. I am currently using CoxPHFitter from the lifelines Python package, but I am unable to extract hazard ratios. I have followed many links, e.g. https://databricks.com/notebooks/survival_analysis/survival_analysis_03_modeling_hazards.html and https://towardsdatascience.com/survival-analysis-part-a-70213df21c2e, but none of them give any details on how to extract the hazard ratio along with confidence intervals for adjusted or unadjusted Cox regression. Using "baseline_hazard" does give the hazard ratio for the intervals but no confidence interval (I am not sure whether this is the right variable to look at). "confidence_intervals" provides the confidence intervals of the covariates, but I am looking for the hazard ratio of the fitted model. Can anyone please help me with this? I am new to this analysis.

            ...

            ANSWER

            Answered 2022-Mar-15 at 12:28

            The hazard ratios (labelled exp(coef)) and confidence intervals are available in the cph.summary, and in a prettier format with cph.print_summary().

            Source https://stackoverflow.com/questions/71475809

            QUESTION

            Relations between objects involved in an UML communication diagram
            Asked 2021-Nov-09 at 22:48
            In short

            In a UML communication diagram, the objects that interact with each other are visually connected with lines along which sequenced messages can circulate in both directions. Lines between objects usually represent links, i.e. instances of an association. But objects can exchange messages even if they are not associated (e.g. if one object is passed as a parameter or returned to the other).

            I wish to represent such a message exchange between objects that are not associated. But I would like to disambiguate and clarify that there is no direct association between the involved objects. Do the UML specs allow expressing this without creating user-defined stereotypes?

            Moreover, do the current UML specifications define somewhere a term for the relation between objects that interact in a communication diagram? And is it possible to further specify in the diagram how the communicating objects know about each other?

            More research done before asking the question

            I am currently re-reading "The UML User Guide, 2nd edition" by Grady Booch, James Rumbaugh and Ivar Jacobson, a great book that explains the UML specifications in reader-friendly plaintext. It's the updated UML 2 edition of the book and I could map back to the UML 2.5.1 specifications most of their claims.

            In chapter 16 on interactions however, they explain that objects communicate along links:

            A link specifies a path along which one object can dispatch a message to another (or the same) object. (...) If you need to be more precise about how that path exists, you can adorn the appropriate end of the link with one of the following constraints:

            • association: (...) object is visible by association
            • self: (...) object is visible because it is the dispatcher of the operation
            • global: (...) object is visible because it is in an enclosing scope
            • local: (...) object is visible because it is in a local scope
            • parameter: (...) object is visible because it is a parameter

            In chapter 19, they explain for communication diagrams that objects involved in the interaction are shown interlinked:

            you render the links that connect these objects as the arcs of this graph. The links may have rolenames to identify them. Finally, you adorn these links with the messages that objects send and receive.

            This seemed straightforward. So I looked for the corresponding UML 2.5.1 specifications:

            • Links are solely defined as instances of associations.
            • For communication diagrams, there is no mention at all in section 17.9 of links nor communication channels. Figure 17.26 shows interlinked objects (i.e. lifelines), but the lines linking the objects AND the arrows representing the messages along this line seem both to be graphically defined as "messages". This seems very ambiguous to me.
            • Moreover I have not found any reference to constraints, keywords or pre-defined stereotypes that could describe how the objects know about each other in a particular diagram and that could justify a communication channel between them.
            • I could find «association»,«local»,«global»,«parameter» (with the same meaning as above) defined as stereotypes in the obsolete UML 1.4 specification, but no longer in the current specs.

            Hence my question.

            ...

            ANSWER

            Answered 2021-Nov-09 at 22:48

            The specification doesn't say this explicitly, but I think this is the definition of link used:

            Two objects are said to be linked, when they can interact with each other.

            There are several places in the specification that talk about links. The most general one I found in 11.2.3.3

            Each link may be realized by something as simple as a pointer or by something as complex as a network connection, and may represent the possibility of instances being able to communicate because their identities are known by virtue of being passed in as parameters, held in variables or slots, or even because the communicating instances are the same instance.

            From this I derived my simple definition.

            So, I disagree: links are not "solely defined as instances of associations". This misunderstanding probably stems from the fact that specifying links with InstanceSpecifications is only possible for Associations: each InstanceSpecification that wants to have slots for the linked InstanceSpecifications must have an Association as Classifier. But links can be there even if they are not specified. And they can be specified by other model elements.

            These other link specifying model elements include connectors in composite structure diagrams or messages in interaction diagrams or attributes and parameters in class diagrams.

            Communication diagrams are just a way to show an interaction. They have the same underlying metamodel instances as a sequence diagram. Therefore, even though there are lines in between the lifelines, there is no corresponding model element. They are just there to have a place for attaching the message symbols (see Table 17.4 Graphic Paths Included in Communications Diagrams).

            It is possible for a message to reference a connector. So, you can view the line in the communication diagram as a visual representation of this connector. However, it is not necessary to model the connector explicitly. There are tools that will require a connector, but in the specification the connector is optional.

            As you say, there are no stereotypes to describe how objects know each other. You would need to define your own. However, there are other possibilities: a Lifeline represents a connectable element, which can either be a parameter or an attribute (or a variable, but nobody uses them). By looking at the represented attribute, you can find out whether it is a global or local attribute and whether it is an association end.

            Having said that, there is a catch. Officially, all represented elements must belong directly or indirectly to the context class of the interaction (same_classifier constraint in 17.12.17.5). In most cases I encounter, this context is just the system. This means all attributes must be global. I had a long conversation with the author of this section and found out that the idea was that an interaction should belong to a collaboration, and the connection to the system is via CollaborationUses and roleBindings. This is the reason why MagicDraw automatically creates a Collaboration for each Interaction.

            This adds another level of indirection, which solves, as we know ;-) all problems in computer science. Fortunately, most tools don't enforce this rule, so we are free to let our lifelines represent attributes and parameters of real components of our systems.

            Source https://stackoverflow.com/questions/69872921

            QUESTION

            Private string becomes null when called from another method
            Asked 2021-Aug-23 at 03:45

            I am making a 'Who Wants To Be A Millionaire' game. I am at the later stages, and I am trying to store the user names and scores, but when I call getUser() from another method it comes up as null. I printed the value of getUser() to make sure it is actually storing the values. Here is my code:

            ...

            ANSWER

            Answered 2021-Aug-22 at 15:36

            Where do you call the storeUserHighscore method?

            Also, I believe the first if statement is unnecessary.

            Refactor it to be like this.

            Source https://stackoverflow.com/questions/68881397

            QUESTION

            Reindexing error when appending dataframes
            Asked 2021-Jul-29 at 15:21

            I am trying to append two data frames.

            The dataframes have duplicate columns which should be merged into one with new values added as extra rows.

            ...

            ANSWER

            Answered 2021-Jul-29 at 10:18

            The problem is the duplicated column names (here, the two a columns). A solution for deduplicating them is to use GroupBy.cumcount with DataFrame.stack for all DataFrames with duplicates.
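
The answer's own code is not reproduced on this page. As a rough sketch of the idea, the helper below (a hypothetical name, split_duplicate_columns) numbers repeated column names with GroupBy.cumcount and moves each repeat into extra rows. It swaps DataFrame.stack for a pd.concat over the numbered groups, which gives the same values in a different row order; the sample frames are made up for illustration.

```python
import pandas as pd

def split_duplicate_columns(df):
    """Turn repeated column names into extra rows so every column name is unique."""
    names = pd.Series(list(df.columns))
    # 0 for the first occurrence of a name, 1 for the second, and so on.
    occurrence = names.groupby(names).cumcount()
    pieces = []
    for k in sorted(occurrence.unique()):
        mask = (occurrence == k).to_numpy()
        # Each piece holds at most one column per name.
        pieces.append(df.loc[:, mask])
    return pd.concat(pieces, ignore_index=True)

# Hypothetical input: the first frame has two columns named "a".
df1 = pd.DataFrame([[1, 2, 3]], columns=["a", "a", "b"])
df2 = pd.DataFrame({"a": [4], "b": [5]})

combined = pd.concat([split_duplicate_columns(df1), df2], ignore_index=True)
print(combined)
```

Once every frame has unique column names, an ordinary pd.concat no longer needs to reindex over ambiguous labels.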

            Source https://stackoverflow.com/questions/68573937

            QUESTION

            How to solve "TypeError: __array__() takes 1 positional argument but 2 were given" for Tensorflow CNN model?
            Asked 2021-Jul-01 at 03:19

            I am building a multi-input, single-output CNN using Keras's functional API. There are SMILE data inputs which are 1D sequences and Proteins, which are also 1D sequences. I have the following data types and structures:

            ...

            ANSWER

            Answered 2021-Jul-01 at 03:19

            The problem is with the input dimensions: Conv1D expects 3D input, but in this case it is 2D.

            Conv1D needs data with a 3D shape: [batch_size, time_steps, feature_size].

            For example, if we provide, for each of the 50 batch samples and each of the 2 time steps, a 100-dimensional vector, the input data shape should be (50, 2, 100).
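
To make the shapes concrete, here is a small NumPy-only sketch (no Keras model is built). The array names are hypothetical, and whether a trailing feature axis is the right layout for the asker's SMILES and protein sequences is an assumption.

```python
import numpy as np

batch_size, time_steps, feature_size = 50, 2, 100

# A 3D batch in the layout described above: (batch_size, time_steps, feature_size).
x_3d = np.zeros((batch_size, time_steps, feature_size), dtype="float32")

# A 2D array of plain sequences, one row per sample, has no feature axis.
x_2d = np.zeros((batch_size, feature_size), dtype="float32")

# One option: treat each sequence position as a time step with a single feature.
x_2d_expanded = x_2d[..., np.newaxis]

print(x_3d.shape)
print(x_2d_expanded.shape)
```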

            Source https://stackoverflow.com/questions/68046811

            QUESTION

            Installed a package on command line and can import it via command line. Receive ModuleNotFoundError when importing in jupyter notebook
            Asked 2021-May-21 at 12:36

            I installed the Python package lifelines from my terminal. Windows Terminal is my terminal of choice, with PowerShell and Anaconda terminals that I often use.

            I tried installing the package using the provided commands in the documentation:

            pip install lifelines and conda install -c conda-forge lifelines

            Both times the installation is marked as successful. When I run Python within the terminal I can import the lifelines package without a problem. However, when I import it in a Jupyter notebook it yields a ModuleNotFoundError.

            The base environment I use does not contain the lifelines package when I verify its contents using the Anaconda Navigator.

            ...

            ANSWER

            Answered 2021-May-21 at 12:33

            The jupyter notebooks are run on Anaconda Powershell, and so are the environments and packages.

            Installing on the Windows Powershell will never work. Running the conda install -c conda-forge lifelines in the Anaconda shell solved the issue.

            So silly, yet so time consuming it is worth sharing.
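
A quick way to diagnose this kind of mismatch is to ask the running interpreter where it lives and whether it can see the package. The sketch below uses only the standard library; is_installed is a hypothetical helper name. Run it once in the terminal's Python and once in a notebook cell and compare the output.

```python
import importlib.util
import sys

# The interpreter running this code; in a notebook this is the kernel's Python.
print(sys.executable)

def is_installed(package_name):
    """Return True if the current interpreter can find the package."""
    return importlib.util.find_spec(package_name) is not None

print(is_installed("lifelines"))
```

If the two runs print different interpreter paths, the package was installed into an environment that the notebook kernel does not use.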

            Source https://stackoverflow.com/questions/67636946

            QUESTION

            How do I find the best categorization for survival analysis?
            Asked 2021-May-19 at 12:44

            I have a question concerning survival analysis. I have the following data (just an excerpt):

            Now I am trying to do survival analysis with the Python lifelines package. For example, I want to find out if T-cells influence the Overall Survival (OS). But as far as I know, I need to categorize the number of T-cells into different categories, e.g. High T-Cell and Low T-Cell... Is that right? But how do I find the best-fitting cut-off? My plan is to show that tumors with High T-Cells have better survival than those with Low T-Cells. But how could I find the best cut-off value to discriminate between High and Low T-Cell from the data I have here?

            Does anyone have an idea? A friend of mine said something about "ROC" analysis, but I am really confused now... I would be glad of any help!

            ...

            ANSWER

            Answered 2021-May-19 at 12:44

            The transformation of continuous variables into categorical variables is far from obvious. A first approach can be based on the existing literature, especially in medicine/biology. A review of the existing literature may be sufficient to create these classes. Another method can be based on the empirical distribution of the T-Cells variable, sometimes highlighting an "obvious" categorization. The use of an ROC curve can be a good idea but somehow I don't think it is necessary. Categorizing your variable in Kaplan-Meier type survival analyses is necessary, but if you use Cox models there is no need to categorize this variable. So I would advise you to turn to Cox regressions to conduct your survival analysis. A Cox regression would allow you to add several predictors in your modeling as well as interaction terms, which is more convenient.

            Source https://stackoverflow.com/questions/66813335

            QUESTION

            How can estimate cox model with lifelines package?
            Asked 2021-Feb-24 at 16:46

            I want to estimate Cox models, but when I try to run the code I get an error. The problem seems to be with CoxPHFitter(). Has anyone here solved this problem? I think the lifelines library cannot compute the coefficients with the ML method. Below I copy the errors and sample code. I should say that I wrote the code just as an example, and the inputs are not real.

            code

            ...

            ANSWER

            Answered 2021-Feb-12 at 12:18

            The given error clearly states the problem:

            ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-model Matrix is singular.

            Without the real data I can't give any further advice. But the lifelines documentation gives a lot of advice on this issue:

            Convergence halted due to matrix inversion problems: This means that there is high collinearity in your dataset. That is, a column is equal to the linear combination of 1 or more other columns. A common cause of this error is dummying categorical variables but not dropping a column, or some hierarchical structure in your dataset. Try to find the relationship by:

            • adding a penalizer to the model, ex: CoxPHFitter(penalizer=0.1).fit(…) until the model converges. In the print_summary(), the coefficients that have high collinearity will have large (absolute) magnitude in the coefs column.
            • using the variance inflation factor (VIF) to find redundant variables.
            • looking at the correlation matrix of your dataset, or

            This is very likely not an error caused by lifelines; instead, it is your data or how you apply the model to your data.
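
The "dummying without dropping a column" cause can be checked before fitting. The sketch below uses only pandas and NumPy on a small made-up covariate table; centered_rank is a hypothetical helper. Columns are centered before taking the rank because a constant column carries no information in a Cox model, so dummy columns that sum to 1 in every row are redundant there.

```python
import numpy as np
import pandas as pd

# Hypothetical covariates: one categorical column and one numeric column.
covariates = pd.DataFrame({
    "stage": ["I", "II", "III", "II", "I", "III", "II", "I"],
    "age": [61, 54, 70, 66, 58, 73, 49, 64],
})

# Keeping every dummy column makes them sum to 1 in each row (perfect collinearity).
all_dummies = pd.get_dummies(covariates, columns=["stage"], dtype=float)

# Dropping one level removes that redundancy.
dropped = pd.get_dummies(covariates, columns=["stage"], drop_first=True, dtype=float)

def centered_rank(frame):
    """Rank of the column-centered design matrix."""
    values = frame.to_numpy(dtype=float)
    return np.linalg.matrix_rank(values - values.mean(axis=0))

# A rank below the number of columns signals a redundant column.
print(all_dummies.shape[1], centered_rank(all_dummies))
print(dropped.shape[1], centered_rank(dropped))

# The correlation matrix is another quick place to look for strongly related columns.
print(dropped.corr().round(2))
```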

            Source https://stackoverflow.com/questions/65926665

            QUESTION

            How to indicate Kaplan-Meier Fitter (python) to plot 90% of datapoints to avoid sudden drops
            Asked 2020-Dec-07 at 18:36

            I am writing some python code to do Kaplan-Meier (KM) curves using the KM Fitter and usually plot 4 curves in the same graph to compare different groups. The basic way to get a KM curve is:

            from lifelines import KaplanMeierFitter

            # Create the KMF object
            KM_curve = KaplanMeierFitter()

            # Give data to object. Status is 0 if alive, 1 if deceased (in my case)
            KM_curve.fit(durations=My_Data["Time"], event_observed=My_Data["Status"])

            # I do a figure in which I use this line 4 times (one per group)
            KM_curve.plot(ci_show=False)

            With those 4 lines of code and a pandas dataframe (here called My_Data) the KM Fitter automatically does all the calculations and plotting, but I was wondering if anyone knows how to stop the curve prematurely. I have done around 50 different graphs, they look nice and give me the info I need, but sometimes the last part of some curves dramatically drops to 0% (vertically) or very close to it. That is weird since none of my groups has 0 survivors at the end of my x-axis [See in this example, the red line https://i.stack.imgur.com/bn6Vy.png ]

            I did read that the KM curves are good to see trends in the middle section, but the last part of the curves may be misleading and has to be examined carefully. That is especially true if there are not enough patients left in that group and thus, the %survival estimate drops dramatically. Someone who does bioinformatics told me she usually stops plotting the curve whenever 10% of patients are left, to prevent this issue. Is it possible to do that in python KMF?

            ...

            ANSWER

            Answered 2020-Dec-07 at 18:36

            There are a few ways to achieve this:

            1.

            Source https://stackoverflow.com/questions/65154800

            QUESTION

            Getting Concordance result of lifelines CoxPH model in a dataframe
            Asked 2020-Oct-20 at 10:29

            I am using the CoxPH implementation of the lifelines package in Python. Currently, results are in a tabular view of coefficients and related stats and can be seen with print_summary(). Here is an example:

            ...

            ANSWER

            Answered 2020-Jun-25 at 13:20

            You can access the c-index with cph.concordance_index_, and you could put this into a list or dataframe if you wish.

            Source https://stackoverflow.com/questions/62550463

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install lifelines

            You can install using 'pip install lifelines' or download it from GitHub, PyPI.
            You can use lifelines like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

            If you are new to survival analysis, wondering why it is useful, or are interested in lifelines examples, API, and syntax, please read the Documentation and Tutorials page.

            Install
          • PyPI

            pip install lifelines

          • CLONE
          • HTTPS

            https://github.com/CamDavidsonPilon/lifelines.git

          • CLI

            gh repo clone CamDavidsonPilon/lifelines

          • sshUrl

            git@github.com:CamDavidsonPilon/lifelines.git
