polars | Fast multi-threaded, hybrid-out-of-core DataFrame library in Rust | Python | Nodejs | GPU library
kandi X-RAY | polars Summary
Python Documentation | Rust Documentation | User Guide | Discord | StackOverflow.
polars Key Features
polars Examples and Code Snippets
import polars as pl
from my_polars_functions import hamming_distance
a = pl.Series("a", ["foo", "bar"])
b = pl.Series("b", ["fooy", "ham"])
dist = hamming_distance(a, b)
expected = pl.Series("", [None, 2], dtype=pl.UInt32)
# run on 2 Series
print(
import polars as pl
df = pl.DataFrame({'a': [1, None, 3, 4],
'b': [10, 20, 30, 40]
}).lazy()
print(df.collect())
shape: (4, 2)
┌──────┬─────┐
│ a ┆ b │
│ --- ┆ --- │
│ i
pl.sum([exp1, exp2, etc...])
pl.fold(pl.lit(0), f=lambda c1, c2: c1 + c2, exprs =[expr1, expr2, etc...])
col_names = ["p1", "p2", "p3"]
weights = [7.4, 3.2, -0.13]
df.with_column(
pl.s
import polars as pl
import numpy as np
import plotly.express as px
df = pl.DataFrame(
{
"nrs": [1, 2, 3, None, 5],
"names": ["foo", "ham", "spam", "egg", None],
"random": np.random.rand(5),
"groups": ["
df1 = pl.DataFrame({"a": [1], "b": [2], "c": [3]})
df2 = pl.DataFrame({"a": [4], "b": [5], "c": [6]})
# new memory slab
new_df = pl.concat([df1, df2], rechunk=True)
# append free (no memory copy)
new_df = df1.vstack(df2)
# try to appen
df = pl.DataFrame({
"a": [1, 2, 3],
"b": [True, None, False]
})
df.select([
pl.lit("foo").alias("z"),
pl.all()
])
shape: (3, 3)
┌─────┬─────┬───────┐
│ z ┆ a ┆ b │
│ --- ┆ --- ┆ --- │
│ s
lag_vector = [1, 2, 3]
for lag in lag_vector:
out = (
df
.groupby_rolling(index_column="Date", period=f"{lag}w").agg(
[
pl.col('Close Returns').alias('Close Returns list'),
pl
df = pl.DataFrame({
"epoch_seconds": [1648457740, 1648457740 + 10]
})
MILLISECONDS_IN_SECOND = 1000
df.select(
(pl.col("epoch_seconds") * MILLISECONDS_IN_SECOND).cast(pl.Datetime).dt.and_time_unit("ms").alias("datetime")
)
import polars as pl
import pandas as pd
df = pd.read_excel(...)
df_pl = pl.DataFrame(df)
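from io import StringIO
import polars as pl
# In-memory CSV with date columns in mixed formats
my_csv = StringIO(
"""
ID,start,last_updt,end
1,2008-10-31, 2020-11-28 12:48:53,12/31/2008
2,2007-10-31, 2021-11-29 01:37:20,12/31/2007
3,2006-10-31, 2021-11-30 23:22:05,12/31/2006
"""
)
# parse_dates is the keyword used when this snippet was written;
# later polars releases may name it differently
pl.read_csv(my_csv, parse_dates=True)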
Community Discussions
Trending Discussions on polars
QUESTION
I know how to apply a function to all columns present in a Pandas-DataFrame. However, I have not figured out yet how to achieve this when using a Polars-DataFrame.
I checked the section of the Polars User Guide devoted to this topic, but I have not found the answer. Here I attach a code snippet with my unsuccessful attempts.
...ANSWER
Answered 2021-Jun-11 at 09:30
You can use the expression syntax to select all columns with pl.col("*") and then map the numpy np.log2(..) function over the columns.
QUESTION
What is the difference between Arrow IPC and Feather?
The official documentation says:
Version 2 (V2), the default version, which is exactly represented as the Arrow IPC file format on disk. V2 files support storing all Arrow data types as well as compression with LZ4 or ZSTD. V2 was first made available in Apache Arrow 0.17.0.
However, vaex, a pandas alternative, has two different functions, one for Arrow IPC and one for Feather. polars, another pandas alternative, indicates that Arrow IPC and Feather are the same.
...ANSWER
Answered 2021-Jun-09 at 20:18
TL;DR There is no difference between the Arrow IPC file format and Feather V2.
There's some confusion because of the two versions of Feather, and because of the Arrow IPC file format vs the Arrow IPC stream format.
For the two versions of Feather, see the FAQ entry:
What about the “Feather” file format?
The Feather v1 format was a simplified custom container for writing a subset of the Arrow format to disk prior to the development of the Arrow IPC file format. “Feather version 2” is now exactly the Arrow IPC file format and we have retained the “Feather” name and APIs for backwards compatibility.
So IPC == Feather(V2). Some places that refer to Feather mean Feather(V1), which is different from the IPC file format. However, that doesn't seem to be the issue here: Polars and Vaex appear to use Feather to mean Feather(V2) (though Vaex slightly misleadingly says "Feather is exactly represented as the Arrow IPC file format on disk, but also support compression").
Vaex exposes both export_arrow and export_feather. This relates to another point of Arrow, as it defines both an IPC stream format and an IPC file format. They differ in that the file format has a magic string (for file identification) and a footer (to support random access reads) (documentation).
export_feather always writes the IPC file format (==FeatherV2), while export_arrow lets you choose between the IPC file format and the IPC stream format. Looking at where export_feather was added, I think the confusion might stem from the PyArrow APIs making it obvious how to enable compression with the Feather API methods (which are a user-friendly convenience) but not with the IPC file writer (which is what export_arrow uses). But ultimately, the format being written is the same.
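The magic string mentioned above can be checked directly. The following is a minimal sketch, not part of the original answer: it assumes the Arrow IPC file format begins and ends with the ASCII bytes ARROW1, as the Arrow columnar format specification describes, and the function name looks_like_arrow_ipc_file is hypothetical. It only inspects the magic bytes and does not validate the footer.

```python
# Assumption: the Arrow IPC *file* format (== Feather V2) starts and ends
# with the 6-byte magic string b"ARROW1"; the IPC *stream* format has no
# such magic string.
ARROW_MAGIC = b"ARROW1"


def looks_like_arrow_ipc_file(data: bytes) -> bool:
    """Return True if the bytes carry the IPC file magic at both ends.

    This is a cheap heuristic: it does not parse or validate the footer.
    """
    if len(data) < 2 * len(ARROW_MAGIC):
        return False
    return data.startswith(ARROW_MAGIC) and data.endswith(ARROW_MAGIC)


# Hand-built byte strings for illustration only (not real Arrow data).
fake_file = ARROW_MAGIC + b"\x00\x00" + b"...payload and footer..." + ARROW_MAGIC
fake_stream = b"\xff\xff\xff\xff" + b"...stream messages..."

print(looks_like_arrow_ipc_file(fake_file))
print(looks_like_arrow_ipc_file(fake_stream))
```

A file written through a Feather V2 API and one written through an IPC file writer should both pass this check, which is consistent with the answer's point that the two are the same format on disk.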
QUESTION
I'm using polars and I would like to define the type of the columns while loading a dataframe. In pandas, I can use dtype:
ANSWER
Answered 2021-Apr-17 at 07:19
The with_schema method expects an Arc type, not a HashMap.
The following code works:
QUESTION
I have transformed my data points to polar coordinates and am displaying their distribution using a histogram. How do I change the x axis to be in multiples of pi?
...ANSWER
Answered 2020-Oct-27 at 11:29
Needed to add:
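The code from the original answer is not reproduced on this page. As a rough sketch of one possible approach (an assumption, not the answerer's code), tick values can be turned into labels in multiples of pi with a small helper. The name pi_label and the max_denominator parameter are hypothetical.

```python
import math
from fractions import Fraction


def pi_label(value: float, max_denominator: int = 12) -> str:
    """Format a number as a multiple of pi, e.g. pi/2 -> 'π/2'."""
    # Approximate value / pi by a fraction with a small denominator.
    frac = Fraction(value / math.pi).limit_denominator(max_denominator)
    if frac == 0:
        return "0"
    sign = "-" if frac < 0 else ""
    numerator, denominator = abs(frac.numerator), frac.denominator
    head = "π" if numerator == 1 else f"{numerator}π"
    if denominator == 1:
        return f"{sign}{head}"
    return f"{sign}{head}/{denominator}"


for tick in (0.0, math.pi / 2, math.pi, -3 * math.pi / 4, 2 * math.pi):
    print(pi_label(tick))
```

With matplotlib, a helper like this could be wrapped in matplotlib.ticker.FuncFormatter and combined with matplotlib.ticker.MultipleLocator(math.pi / 2) on the x axis, so that ticks fall on multiples of pi/2 and are labelled accordingly.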
QUESTION
I'd like to draw two lines at 88 and 84 degrees north on a cartopy north polar stereo map, but am stumped as to how to do it.
I've tried with:
...ANSWER
Answered 2020-Mar-31 at 16:32
This should be available in the next release (0.18). You can test it out if you build/install CartoPy from git master.
QUESTION
I'm using Cartopy for my polar research and would like to clip a circular boundary around my data, which I plot in the NorthPolarStereo() projection. I use set_extent to indicate from what latitude I would like to plot my data and use set_boundary for creating a circular boundary as explained in the gallery. I then use matplotlib.pyplot.pcolormesh to plot the actual data. However, say I use set_extent to define a minimum latitude of 55 degrees, some of my data below 55 degrees is still being plotted outside of my set_boundary. How do I clip off this data?
ANSWER
Answered 2019-Jun-18 at 12:55
I don't have cartopy to test it in the same conditions as you, but you can clip a pcolormesh using a Patch object of any shape:
QUESTION
Thanks to the answer to this question I can plot the geopandas world map with continents and oceans coloured in different projections.
Now I would like to add some points, e.g. the cities included in geopandas
...ANSWER
Answered 2019-Apr-16 at 14:37
The default drawing order for axes is patches, lines, text. This order is determined by the zorder attribute.
QUESTION
Adding a further requirement to this question, I also need to have the oceans in blue (or any other colour).
For the 'PlateCarree' projection I can simply do this
...ANSWER
Answered 2019-Apr-12 at 16:10
You need to plot the map geometries on Cartopy geoaxes, and use cartopy.feature.OCEAN to plot the ocean. Here is the working code that you may try. Read the comments in the code for clarification.
QUESTION
I want to use the low resolution world map included in geopandas (see here) as a background for my data. This works fine as long as I use e.g. the 'PlateCarree' projection.
If I now want to use a polar stereographic projection
...ANSWER
Answered 2019-Apr-12 at 12:27
When plotting with a specific cartopy projection, it is best to actually create the matplotlib figure and axes using cartopy, to make sure it is aware of the projection (in technical terms: to make sure it is a GeoAxes, see https://scitools.org.uk/cartopy/docs/latest/matplotlib/intro.html):
QUESTION
I have a NetCDF dataset with dimensions [time, height, latitude, longitude] that I've opened with xarray. I've written code that projects all the data for a specific timestamp onto a cartopy map and saves the image to my directory. I'd like to create an image for each timestamp, but at the moment the only way I know how to do it is to manually change the timestamp entry and run the code again. Since there are 360 timestamps, this would obviously take some time. I know Python is handy for loops, but I'm very unfamiliar with them, so is there a way of embedding this code within a loop so that I can save multiple images in one go?
...ANSWER
Answered 2019-Feb-02 at 19:14
It seems straightforward to put this in a loop so I am not sure why this is so difficult. Nevertheless, you can try the following. Here I have moved some definitions outside the for loop because you don't need to define them 360 times again and again.
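The answer's own code is not reproduced on this page. The loop structure it describes could look roughly like the sketch below, which is an illustration under stated assumptions and not the answerer's code. The names save_all_frames and render are hypothetical; render stands in for the user's existing plot-and-save routine (for example, one built on xarray and cartopy), and anything that does not change between timestamps would be defined once outside the loop.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def save_all_frames(
    timestamps: Iterable,
    render: Callable[[object, Path], None],
    out_dir: str,
) -> List[Path]:
    """Call render(timestamp, target_path) once per timestamp.

    `render` is a hypothetical stand-in for the existing plotting code;
    it is expected to draw one timestamp and save it to target_path.
    """
    # Defined once, outside the loop, because it never changes.
    out = Path(out_dir)
    saved = []
    for index, stamp in enumerate(timestamps):
        # Zero-padded index keeps the files sorted in time order.
        target = out / f"frame_{index:03d}.png"
        render(stamp, target)
        saved.append(target)
    return saved


# Demonstration with a fake renderer that only records what it was asked to do.
calls = []
paths = save_all_frames(
    ["t0", "t1", "t2"],
    lambda stamp, path: calls.append((stamp, path.name)),
    "frames",
)
print([p.name for p in paths])
```

Passing the plotting routine in as a callable keeps the loop independent of any particular plotting library, so the same structure applies whether there are 3 timestamps or 360.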
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install polars
You can take the latest release from crates.io, or, if you want the latest features and performance improvements, point to the master branch of this repo. Requires Rust version >=1.58.