causality | Tools for causal analysis | Analytics library
kandi X-RAY | causality Summary
This package contains tools for causal analysis using observational (rather than experimental) datasets.
Top functions reviewed by kandi - BETA
- Calculates the expected value
- PDF of the covariance function
- Estimate the model
- Estimate the imbalance of each confound
- Calculate the imbalance ratio
- Estimate the ATC coefficient
- Determines whether one assignment matches another
- Calculate the confidence interval for a given distribution
- Estimate the weight of an effect
- Estimate the predicted effect probability
- Wrapper for plot
- Compute z mean
- Calculate the bootstrapped mean plot
- Calculates the bootstrap statistic for a given dataframe
- Compute the integration function
- Evaluate the expectation function
- Returns a set of adjacables that satisfy the given effect
- Return whether the assumption is satisfied
- Finds predecessors of given causes
causality Key Features
causality Examples and Code Snippets
Community Discussions
Trending Discussions on causality
QUESTION
I want to build an iteration using the Stata FAQ here. The second method seems fitting for my case. I have built the following code:
...ANSWER
Answered 2021-Dec-15 at 18:22
display of the current level is easy enough, say:
QUESTION
I have two vectors x and y.
ANSWER
Answered 2021-Nov-22 at 13:34
These procedures fail because waldtest() is looking for attributes/methods that don't exist in/for fitted model objects returned by fast.lm() and lm() with data.table input, and I don't believe there is an easy fix.
However, if you want to test whether the model including y and a constant is "better" than the constant-only model, you can use summary() (which has a method for objects of class fast.lm).
E.g. from the last line of summary(fit_3) you'll get
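If you'd rather see what that final F test computes, here is a minimal sketch in Python with numpy (not the original R code; the data and names are hypothetical) comparing the full model against the constant-only model:

```python
import numpy as np

# Hypothetical data standing in for the question's x and y.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 0.5 * x + rng.normal(size=100)
n = len(y)

# Full model: y ~ 1 + x
X_full = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X_full, y, rcond=None)
rss_full = np.sum((y - X_full @ beta) ** 2)

# Restricted (constant-only) model: y ~ 1
rss_const = np.sum((y - y.mean()) ** 2)

# F statistic for dropping x (1 restriction, n - 2 residual df);
# this is the statistic summary() reports on its last line.
f_stat = ((rss_const - rss_full) / 1) / (rss_full / (n - 2))
print(round(f_stat, 2))
```

The full model always fits at least as well as the nested constant-only model, so the statistic is non-negative; a large value means x adds real explanatory power.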
QUESTION
I've been studying the memory model and saw this (quote from https://research.swtch.com/hwmm):
...ANSWER
Answered 2021-Sep-11 at 08:47
It makes some sense to call StoreLoad reordering an effect of the store buffer because the way to prevent it is with mfence or a locked instruction that drains the store buffer before later loads are allowed to read from cache. Merely serializing execution (with lfence) would not be sufficient, because the store buffer still exists. Note that even sfence; lfence isn't sufficient.
Also I assume P5 Pentium (in-order dual-issue) has a store buffer, so SMP systems based on it could have this effect, in which case it would definitely be due to the store buffer. IDK how thoroughly the x86 memory model was documented in the early days before PPro even existed, but any naming of litmus tests done before that might well reflect in-order assumptions. (And naming after might include still-existing in-order systems.)
You can't tell which effect caused StoreLoad reordering. It's possible on a real x86 CPU (with a store buffer) for a later load to execute before the store has even written its address and data to the store buffer.
And yes, executing a store just means writing to the store buffer; it can't commit from the SB to L1d cache and become visible to other cores until after the store retires from the ROB (and thus is known to be non-speculative).
(Retirement happens in-order to support "precise exceptions". Otherwise, chaos ensues and discovering a mis-predict might mean rolling back the state of other cores, i.e. a design that's not sane. Can a speculatively executed CPU branch contain opcodes that access RAM? explains why a store buffer is necessary for OoO exec in general.)
I can't think of any detectable side-effect of the load uop executing before the store-data and/or store-address uops, or before the store retires, rather than after the store retires but before it commits to L1d cache.
You could force the latter case by putting an lfence between the store and the load, so the reordering is definitely caused by the store buffer. (A stronger barrier like mfence, a locked instruction, or a serializing instruction like cpuid will all block the reordering entirely by draining the store buffer before the later load can execute. As an implementation detail, before it can even issue.)
A normal out of order exec treats all instructions as speculative, only becoming non-speculative when they retire from the ROB, which is done in program order to support precise exceptions. (See Out-of-order execution vs. speculative execution for a more in-depth exploration of that idea, in the context of Intel's Meltdown vulnerability.)
A hypothetical design with OoO exec but no store buffer would be possible. It would perform terribly, with each store having to wait for all previous instructions to be definitively known to not fault or otherwise be mispredicted / mis-speculated before the store can be allowed to execute.
This is not quite the same thing as saying that they need to have already executed, though (e.g. just executing the store-address uop of an earlier store would be enough to know it's non-faulting, or for a load, doing the TLB/page-table checks will tell you it's non-faulting even if the data hasn't arrived yet). However, every branch instruction would need to be already executed (and known-correct), as would every ALU instruction like div that can fault.
Such a CPU also doesn't need to stop later loads from running before stores. A speculative load has no architectural effect / visibility, so it's ok if other cores see a share-request for a cache line which was the result of a mis-speculation. (On a memory region whose semantics allow that, such as normal WB write-back cacheable memory). That's why HW prefetching and speculative execution work in normal CPUs.
The memory model even allows StoreLoad ordering, so we're not speculating on memory ordering, only on the store (and other intervening instructions) not faulting. Which again is fine; speculative loads are always fine, it's speculative stores that we must not let other cores see. (So we can't do them at all if we don't have a store buffer or some other mechanism.)
(Fun fact: real x86 CPUs do speculate on memory ordering by doing loads out of order with each other, depending on addresses being ready or not, and on cache hit/miss. This can lead to memory-order mis-speculation "machine clears", aka pipeline nukes (the machine_clears.memory_ordering perf event), if another core wrote to a cache line between when it was actually read and the earliest the memory model said we could. Or even if we guess wrong about whether a load is going to reload something stored recently or not; memory disambiguation when addresses aren't ready yet involves dynamic prediction, so you can provoke machine_clears.memory_ordering with single-threaded code.)
Out-of-order exec in P6 didn't introduce any new kinds of memory re-ordering because that could have broken existing multi-threaded binaries. (At that time mostly just OS kernels, I'd guess!) That's why early loads have to be speculative if done at all. x86's main reason for existence is backwards compat; back then it wasn't the performance king.
Re: why this litmus test exists at all, if that's what you mean?
Obviously to highlight something that can happen on x86.
Is StoreLoad reordering important? Usually it's not a problem; acquire / release synchronization is sufficient for most inter-thread communication about a buffer being ready to read, or more generally a lock-free queue. Or to implement mutexes. ISO C++ only guarantees that mutexes lock / unlock are acquire and release operations, not seq_cst.
It's pretty rare that an algorithm depends on draining the store buffer before a later load.
Say I somehow observed this litmus test on an x86 machine,
Fully working program that verifies that this reordering is possible in real life on real x86 CPUs: https://preshing.com/20120515/memory-reordering-caught-in-the-act/. (The rest of Preshing's articles on memory ordering are also excellent. Great for getting a conceptual understanding of inter-thread communication via lockless operations.)
QUESTION
I have read several papers claiming to compare multiple entities (firms, countries, etc.) at the same time using Granger causality. They usually group countries into east, west, north and south, or urban / non-urban (for example).
E.g. the public "Grunfeld" dataset in the plm package has the following form:
| firm | year | inv | value | capital |
I have a similar dataset which I aim to explore using Granger Causality. My time variable is from 1990 to 2020.
country_ID country_type year sum var_A var_B ...
1 1 2000 323 32 213
2 1 2000 13 0 7
3 2 2000 12 7 0
4 1 2000 0 0 0
5 2 2000 323 13 56
Stata does not return anything useful running this through; I suppose it's because of too many countries.
Is it even possible to take a subset of my dataset, say country_type = 1, and run a Granger test over the countries included? Or do I have to sum up all countries in country_type in order to run Granger?
One paper I looked into is (p.1310): "The Causality Analysis between Transportation and Regional Development in City-level China" by He et al. (2019).
...ANSWER
Answered 2021-Aug-27 at 09:29
You do not write which of Stata's commands you use. If you have panel data, you may want to use the panel Granger (non-)causality test by Dumitrescu/Hurlin (2012), implemented in Stata's user-contributed command xtgcause and in R's plm package as pgrangertest.
Below is an example of how to use pgrangertest with the Grunfeld dataset in R. The help page has references to the literature and some more information on how to use more options (?pgrangertest):
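The R snippet itself was stripped from this page. As a rough, non-panel illustration of what a bivariate Granger test computes, here is a minimal Python/numpy sketch on simulated data (all names and data are hypothetical; for panel data you would still want pgrangertest or xtgcause):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated series where x Granger-causes y at lag 1.
T = 300
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

p = 1           # lag order
Y = y[p:]       # y_t for t = p .. T-1
n = len(Y)

def rss(X):
    # Residual sum of squares of a least-squares fit of Y on X.
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return np.sum((Y - X @ beta) ** 2)

# Restricted model: y_t on a constant and its own lag only.
X_r = np.column_stack([np.ones(n), y[:-1]])
# Unrestricted model: additionally on the lag of x.
X_u = np.column_stack([np.ones(n), y[:-1], x[:-1]])

# F statistic for H0: the x lag adds nothing (no Granger causality).
q = p                      # number of restrictions
df = n - X_u.shape[1]      # residual degrees of freedom
F = ((rss(X_r) - rss(X_u)) / q) / (rss(X_u) / df)
print(round(F, 1))
```

A large F leads to rejecting the null of no Granger causality; the Dumitrescu/Hurlin test generalizes this idea by averaging individual Wald statistics across the panel's cross-sectional units.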
QUESTION
This question came to me after reading this answer.
Code example:
...ANSWER
Answered 2021-Feb-23 at 01:56
Partial answer: how "unsafe republication" works on OpenJDK today.
(This is not the ultimate general answer I would like to get, but at least it shows what to expect on the most popular Java implementation)
In short, it depends on how the object was published initially:
- if initial publication is done through a volatile variable, then "unsafe republication" is most probably safe, i.e. you will most probably never see the object as partially constructed
- if initial publication is done through a synchronized block, then "unsafe republication" is most probably unsafe, i.e. you will most probably be able to see object as partially constructed
I say "most probably" because I base my answer on the assembly generated by the JIT for my test program, and, since I am not an expert in JIT, it would not surprise me if the JIT generated totally different machine code on someone else's computer.
For tests I used OpenJDK 64-Bit Server VM (build 11.0.9+11-alpine-r1, mixed mode) on ARMv8.
ARMv8 was chosen because it has a very relaxed memory model, which requires memory barrier instructions in both publisher and reader threads (unlike x86).
1. Initial publication through a volatile variable: most probably safe
The test Java program is like the one in the question (I only added one more thread to see what assembly code is generated for a volatile write):
QUESTION
I have a data frame with a column named title. I want to apply textdistance to check similarities between different titles and remove any rows with similar titles (based on a specific threshold). Is there a way to do that directly, or do I need to define a custom function and group similar titles together before removing "duplicates" (titles that are similar)? A sample would look like this.
...ANSWER
Answered 2021-Feb-13 at 10:03
So I have done it in a different way. I created a column to mask which rows to keep and which to delete. I accessed the target row and checked its similarity with the rows below it.
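The answerer's code was stripped from this page. A minimal sketch of the keep/delete-mask idea, using Python's standard-library difflib in place of textdistance (titles and threshold are hypothetical):

```python
from difflib import SequenceMatcher

titles = [
    "Granger causality in panel data",
    "Granger causality in panel data!",       # near-duplicate
    "ARIMA models with exogenous regressors",
]

THRESHOLD = 0.9  # hypothetical similarity cutoff

keep = [True] * len(titles)
for i, t in enumerate(titles):
    if not keep[i]:
        continue
    # Compare the current row only against the rows below it,
    # marking later near-duplicates for removal.
    for j in range(i + 1, len(titles)):
        if keep[j] and SequenceMatcher(None, t, titles[j]).ratio() >= THRESHOLD:
            keep[j] = False

deduped = [t for t, k in zip(titles, keep) if k]
print(deduped)
```

Because only rows below the current one are compared, the first occurrence of each cluster of similar titles survives.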
QUESTION
I'm currently trying to run a Granger causality analysis in R/RStudio. I am receiving errors about aliased coefficients when using the function grangertest(). From my understanding, this occurs because there is perfect multicollinearity between the variables.
Due to having a very large number of pairwise comparisons (e.g. 200+), I would like to simply run the Granger test with the aliased coefficients as per normal rather than returning an error. According to one answer here, the solution is or was to add set singular.ok=TRUE, but either I am doing it incorrectly or the answer is out of date. I've tried checking the documentation, but have come up empty. Any help would be appreciated.
ANSWER
Answered 2021-Jan-17 at 06:58
After some investigation and emailing the creator of the grangertest package, they sent me this solution. The solution should run on aliased variables when grangertest does not. When the variables are not aliased, the solution should give the same values as the normal Granger test.
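The emailed solution isn't shown on this page, so one can only illustrate the general trick: an SVD/pseudoinverse-based least-squares fit still goes through when regressors are aliased (perfectly collinear), whereas approaches that invert X'X fail. A hedged Python/numpy sketch (the data are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x1 = rng.normal(size=n)
x2 = 2.0 * x1                      # perfectly collinear: an "aliased" regressor
y = 1.0 + x1 + rng.normal(scale=0.1, size=n)

X = np.column_stack([np.ones(n), x1, x2])

# lstsq uses an SVD-based pseudoinverse, so it returns a (minimum-norm)
# solution even though the design matrix is rank deficient. The fitted
# values, the RSS, and therefore an F statistic remain well defined.
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta) ** 2)
print(rank, round(rss, 3))   # rank is 2, not 3, because x2 is aliased
```

The key point is that the fitted values (and hence the test statistic) are unique even though the individual coefficients on aliased columns are not.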
QUESTION
I want to conduct Granger causality tests between several different pairs of variables for country-year data. I can get it working outside of a loop/function, but am having some trouble integrating it with the remainder of my code. I've provided code for a minimal working example and desired output below. Any help would be greatly appreciated. Thank you in advance!
EDIT: I should have specified more clearly in the original post. Each column in this generated data contains time series data for multiple countries. I want to average across the countries and then perform Granger on those variables using the same method detailed below.
Code to Simulate Time Series Data
...ANSWER
Answered 2021-Jan-08 at 01:15
You can apply the function to all combinations in the combn function itself, so there is no need for a separate lapply call.
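A Python analogue of that combn-based approach, applying one function to every unordered pair of columns (the data and the placeholder statistic are hypothetical; a real Granger test would replace pairwise_stat):

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(3)
# Hypothetical country-averaged series, one entry per variable.
data = {name: rng.normal(size=30) for name in ["var_A", "var_B", "var_C"]}

def pairwise_stat(a, b):
    # Placeholder for a real pairwise test (e.g. a Granger regression):
    # here it just returns the lag-1 cross-correlation.
    return float(np.corrcoef(a[:-1], b[1:])[0, 1])

# Apply the function to every pair of columns, combn-style.
results = {
    (u, v): pairwise_stat(data[u], data[v])
    for u, v in combinations(data, 2)
}
print(results)
```

With k variables this produces k(k-1)/2 results in one pass, the same shape of output combn gives in R.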
QUESTION
I'm having a problem where I want to create a column of 4 labels. However, when I try to create these, the labels I make eat into re-labeling the first label I have assigned. For example I am looking to create a label
column like this:
ANSWER
Answered 2020-Sep-21 at 10:28
You can use case_when or nested ifelse statements so that every row will satisfy only one condition, based on the order of the conditions.
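The same first-match-wins logic as case_when can be sketched in Python with numpy.select, where each row gets the label of the first condition it satisfies, so later labels cannot overwrite earlier ones (the values and cutoffs here are hypothetical):

```python
import numpy as np

values = np.array([3, 12, 25, 48, 70])

# Conditions are checked in order; each element takes the label of the
# FIRST condition it satisfies.
conditions = [values < 10, values < 30, values < 50]
labels = ["low", "medium", "high"]

label_col = np.select(conditions, labels, default="very high")
print(label_col.tolist())
```

Ordering the conditions from most to least specific is what prevents the "labels eating into earlier labels" problem described in the question.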
QUESTION
I am trying to fit an ARIMAX model to figure out whether the containment measures (using the Government Response Stringency Index, numbers from 0 to 100) are having a significant effect on the daily new-cases rate. I also want to add test rates.
I programmed everything in R (every ts is stationary, ...) and did the Granger causality test. Result: Pr(>F) is greater than 0.05, so the null hypothesis of no Granger causality cannot be rejected, and the new-cases rate and the containment measures show reverse causality.
Is there any possibility to transform the variable "stringency index" and continue with an ARIMAX model? If so, how do I do this in R?
ANSWER
Answered 2020-Nov-26 at 21:11
In R you have the "forecast" package to build ARIMA models. Recall that there is a difference between true ARIMAX models and linear regressions with ARIMA errors. Check this post by Rob Hyndman (the forecast package author) for more detailed information: The ARIMAX model muddle
Here are Rob Hyndman's examples to fit a linear regression with ARIMA errors - check more information here:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported