Popular New Releases in Pandas
scikit-learn
scikit-learn 1.0.2
pandas
Pandas 1.4.1
streamlit
1.8.1
seaborn
v0.11.2 (August 2021)
notebook
v7.0.0a2
Popular Libraries in Pandas
by scikit-learn python
49728 BSD-3-Clause
scikit-learn: machine learning in Python
by pandas-dev python
33259 BSD-3-Clause
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
by jakevdp jupyter notebook
32215 NOASSERTION
Python Data Science Handbook: full text in Jupyter Notebooks
by streamlit python
18692 Apache-2.0
Streamlit — The fastest way to build data apps in Python
by Pierian-Data jupyter notebook
17053
Course Files for Complete Python 3 Bootcamp Course on Udemy
by ageron jupyter notebook
16775 Apache-2.0
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
by wesm jupyter notebook
15489 NOASSERTION
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
by waditu python
11075 BSD-3-Clause
TuShare is a utility for crawling historical data of China stocks
by dask python
9771 BSD-3-Clause
Parallel computing with task scheduling
Trending New libraries in Pandas
by pola-rs rust
5341 MIT
Fast multi-threaded DataFrame library in Rust | Python | Node.js
by lux-org python
3417 Apache-2.0
Automatically visualize your pandas dataframe via a single print! 📊 💡
by evidentlyai python
2268 Apache-2.0
Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b
by fbdesignpro python
1597 MIT
Visualize and compare datasets, target values and associations, with one line of code.
by alexgolec python
958 MIT
A TD Ameritrade API client for Python. Includes historical data for equities and ETFs, options chains, streaming order book data, complex order construction, and more.
by kotartemiy python
933 MIT
If Google News had a Python library
by theOehrly python
764 MIT
FastF1 is a python package for accessing and analyzing Formula 1 results, schedules, timing data and telemetry
by capitalone python
747 Apache-2.0
What's in your data? Extract schema, statistics and entities from datasets
by 8080labs python
663 MIT
Predictive Power Score (PPS) in Python
Top Authors in Pandas
1
36 Libraries
447
2
17 Libraries
823
3
7 Libraries
110
4
6 Libraries
1389
5
6 Libraries
74
6
6 Libraries
2438
7
6 Libraries
66
8
6 Libraries
27
9
6 Libraries
24
10
5 Libraries
466
1
36 Libraries
447
2
17 Libraries
823
3
7 Libraries
110
4
6 Libraries
1389
5
6 Libraries
74
6
6 Libraries
2438
7
6 Libraries
66
8
6 Libraries
27
9
6 Libraries
24
10
5 Libraries
466
Trending Kits in Pandas
No Trending Kits are available at this moment for Pandas
Trending Discussions on Pandas
Installing scipy and scikit-learn on apple m1
Error while downloading the requirements using pip install (setup command: use_2to3 is invalid.)
Mapping complex JSON to Pandas Dataframe
AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks'>
How to update pandas DataFrame.drop() for Future Warning - all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
Cannot set up a conda environment with python 3.10
ImportError: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic'
Merge two pandas DataFrame based on partial match
Create a new column in a Pandas DataFrame from existing column names
After conda update, python kernel crashes when matplotlib is used
QUESTION
Installing scipy and scikit-learn on apple m1
Asked 2022-Mar-22 at 06:21The installation on the m1 chip for the following packages: Numpy 1.21.1, pandas 1.3.0, torch 1.9.0 and a few other ones works fine for me. They also seem to work properly while testing them. However when I try to install scipy or scikit-learn via pip this error appears:
ERROR: Failed building wheel for numpy
Failed to build numpy
ERROR: Could not build wheels for numpy which use PEP 517 and cannot be installed directly
Why should Numpy be build again when I have the latest version from pip already installed?
Every previous installation was done using python3.9 -m pip install ...
on Mac OS 11.3.1 with the apple m1 chip.
Maybe somebody knows how to deal with this error or if its just a matter of time.
ANSWER
Answered 2021-Aug-02 at 14:33Please see this note of scikit-learn
about
Installing on Apple Silicon M1 hardware
The recently introduced
macos/arm64
platform (sometimes also known asmacos/aarch64
) requires the open source community to upgrade the build configuation and automation to properly support it.At the time of writing (January 2021), the only way to get a working installation of scikit-learn on this hardware is to install scikit-learn and its dependencies from the conda-forge distribution, for instance using the miniforge installers:
https://github.com/conda-forge/miniforge
The following issue tracks progress on making it possible to install scikit-learn from PyPI with pip:
QUESTION
Error while downloading the requirements using pip install (setup command: use_2to3 is invalid.)
Asked 2022-Mar-05 at 07:13version pip 21.2.4 python 3.6
The command:
1pip install -r requirments.txt
2
The content of my requirements.txt
:
1pip install -r requirments.txt
2mongoengine==0.19.1
3numpy==1.16.2
4pylint
5pandas==1.1.5
6fawkes
7
The command is failing with this error
1pip install -r requirments.txt
2mongoengine==0.19.1
3numpy==1.16.2
4pylint
5pandas==1.1.5
6fawkes
7ERROR: Command errored out with exit status 1:
8 command: /Users/*/Desktop/ml/*/venv/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/kn/0y92g7x55qs7c42tln4gwhtm0000gp/T/pip-install-soh30mel/mongoengine_89e68f8427244f1bb3215b22f77a619c/setup.py'"'"'; __file__='"'"'/private/var/folders/kn/0y92g7x55qs7c42tln4gwhtm0000gp/T/pip-install-soh30mel/mongoengine_89e68f8427244f1bb3215b22f77a619c/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/kn/0y92g7x55qs7c42tln4gwhtm0000gp/T/pip-pip-egg-info-97994d6e
9 cwd: /private/var/folders/kn/0y92g7x55qs7c42tln4gwhtm0000gp/T/pip-install-soh30mel/mongoengine_89e68f8427244f1bb3215b22f77a619c/
10 Complete output (1 lines):
11 error in mongoengine setup command: use_2to3 is invalid.
12 ----------------------------------------
13WARNING: Discarding https://*/pypi/packages/mongoengine-0.19.1.tar.gz#md5=68e613009f6466239158821a102ac084 (from https://*/pypi/simple/mongoengine/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
14ERROR: Could not find a version that satisfies the requirement mongoengine==0.19.1 (from versions: 0.15.0, 0.19.1)
15ERROR: No matching distribution found for mongoengine==0.19.1
16
ANSWER
Answered 2021-Nov-19 at 13:30It looks like setuptools>=58
breaks support for use_2to3
:
So you should update setuptools
to setuptools<58
or avoid using packages with use_2to3
in the setup parameters.
I was having the same problem, pip==19.3.1
QUESTION
Mapping complex JSON to Pandas Dataframe
Asked 2022-Feb-25 at 13:57Background
I have a complex nested JSON object, which I am trying to unpack into a pandas df
in a very specific way.
JSON Object
this is an extract, containing randomized data of the JSON object, which shows examples of the hierarchy (inc. children) for 1x family (i.e. 'Falconer Family'), however there is 100s of them in total and this extract just has 1x family, however the full JSON object has multiple -
1{
2 "meta": {
3 "columns": [{
4 "key": "value",
5 "display_name": "Adjusted Value (No Div, USD)",
6 "output_type": "Number",
7 "currency": "USD"
8 },
9 {
10 "key": "time_weighted_return",
11 "display_name": "Current Quarter TWR (USD)",
12 "output_type": "Percent",
13 "currency": "USD"
14 },
15 {
16 "key": "time_weighted_return_2",
17 "display_name": "YTD TWR (USD)",
18 "output_type": "Percent",
19 "currency": "USD"
20 },
21 {
22 "key": "_custom_twr_audit_note_911328",
23 "display_name": "TWR Audit Note",
24 "output_type": "Word"
25 }
26 ],
27 "groupings": [{
28 "key": "_custom_name_747205",
29 "display_name": "* Reporting Client Name"
30 },
31 {
32 "key": "_custom_new_entity_group_453577",
33 "display_name": "NEW Entity Group"
34 },
35 {
36 "key": "_custom_level_2_624287",
37 "display_name": "* Level 2"
38 },
39 {
40 "key": "legal_entity",
41 "display_name": "Legal Entity"
42 }
43 ]
44 },
45 "data": {
46 "type": "portfolio_views",
47 "attributes": {
48 "total": {
49 "name": "Total",
50 "columns": {
51 "time_weighted_return": -0.046732301295604683,
52 "time_weighted_return_2": -0.046732301295604683,
53 "_custom_twr_audit_note_911328": null,
54 "value": 23132492.905107163
55 },
56 "children": [{
57 "name": "Falconer Family",
58 "grouping": "_custom_name_747205",
59 "columns": {
60 "time_weighted_return": -0.046732301295604683,
61 "time_weighted_return_2": -0.046732301295604683,
62 "_custom_twr_audit_note_911328": null,
63 "value": 23132492.905107163
64 },
65 "children": [{
66 "name": "Wealth Bucket A",
67 "grouping": "_custom_new_entity_group_453577",
68 "columns": {
69 "time_weighted_return": -0.045960317420568164,
70 "time_weighted_return_2": -0.045960317420568164,
71 "_custom_twr_audit_note_911328": null,
72 "value": 13264448.506587159
73 },
74 "children": [{
75 "name": "Asset Class A",
76 "grouping": "_custom_level_2_624287",
77 "columns": {
78 "time_weighted_return": 0.000003434094574039648,
79 "time_weighted_return_2": 0.000003434094574039648,
80 "_custom_twr_audit_note_911328": null,
81 "value": 3337.99
82 },
83 "children": [{
84 "entity_id": 10604454,
85 "name": "HUDJ Trust",
86 "grouping": "legal_entity",
87 "columns": {
88 "time_weighted_return": 0.000003434094574039648,
89 "time_weighted_return_2": 0.000003434094574039648,
90 "_custom_twr_audit_note_911328": null,
91 "value": 3337.99
92 },
93 "children": []
94 }]
95 },
96 {
97 "name": "Asset Class B",
98 "grouping": "_custom_level_2_624287",
99 "columns": {
100 "time_weighted_return": -0.025871339096964152,
101 "time_weighted_return_2": -0.025871339096964152,
102 "_custom_twr_audit_note_911328": null,
103 "value": 1017004.7192636987
104 },
105 "children": [{
106 "entity_id": 10604454,
107 "name": "HUDG Trust",
108 "grouping": "legal_entity",
109 "columns": {
110 "time_weighted_return": -0.025871339096964152,
111 "time_weighted_return_2": -0.025871339096964152,
112 "_custom_twr_audit_note_911328": null,
113 "value": 1017004.7192636987
114 },
115 "children": []
116 }]
117 },
118 {
119 "name": "Asset Class C",
120 "grouping": "_custom_level_2_624287",
121 "columns": {
122 "time_weighted_return": -0.030370376329670656,
123 "time_weighted_return_2": -0.030370376329670656,
124 "_custom_twr_audit_note_911328": null,
125 "value": 231142.67772000004
126 },
127 "children": [{
128 "entity_id": 10604454,
129 "name": "HKDJ Trust",
130 "grouping": "legal_entity",
131 "columns": {
132 "time_weighted_return": -0.030370376329670656,
133 "time_weighted_return_2": -0.030370376329670656,
134 "_custom_twr_audit_note_911328": null,
135 "value": 231142.67772000004
136 },
137 "children": []
138 }]
139 },
140 {
141 "name": "Asset Class D",
142 "grouping": "_custom_level_2_624287",
143 "columns": {
144 "time_weighted_return": -0.05382756475465478,
145 "time_weighted_return_2": -0.05382756475465478,
146 "_custom_twr_audit_note_911328": null,
147 "value": 9791282.570000006
148 },
149 "children": [{
150 "entity_id": 10604454,
151 "name": "HUDW Trust",
152 "grouping": "legal_entity",
153 "columns": {
154 "time_weighted_return": -0.05382756475465478,
155 "time_weighted_return_2": -0.05382756475465478,
156 "_custom_twr_audit_note_911328": null,
157 "value": 9791282.570000006
158 },
159 "children": []
160 }]
161 },
162 {
163 "name": "Asset Class E",
164 "grouping": "_custom_level_2_624287",
165 "columns": {
166 "time_weighted_return": -0.01351630404081805,
167 "time_weighted_return_2": -0.01351630404081805,
168 "_custom_twr_audit_note_911328": null,
169 "value": 2153366.6396034593
170 },
171 "children": [{
172 "entity_id": 10604454,
173 "name": "HJDJ Trust",
174 "grouping": "legal_entity",
175 "columns": {
176 "time_weighted_return": -0.01351630404081805,
177 "time_weighted_return_2": -0.01351630404081805,
178 "_custom_twr_audit_note_911328": null,
179 "value": 2153366.6396034593
180 },
181 "children": []
182 }]
183 },
184 {
185 "name": "Asset Class F",
186 "grouping": "_custom_level_2_624287",
187 "columns": {
188 "time_weighted_return": -0.002298190175237247,
189 "time_weighted_return_2": -0.002298190175237247,
190 "_custom_twr_audit_note_911328": null,
191 "value": 68313.90999999999
192 },
193 "children": [{
194 "entity_id": 10604454,
195 "name": "HADJ Trust",
196 "grouping": "legal_entity",
197 "columns": {
198 "time_weighted_return": -0.002298190175237247,
199 "time_weighted_return_2": -0.002298190175237247,
200 "_custom_twr_audit_note_911328": null,
201 "value": 68313.90999999999
202 },
203 "children": []
204 }]
205 }
206 ]
207 },
208 {
209 "name": "Wealth Bucket B",
210 "grouping": "_custom_new_entity_group_453577",
211 "columns": {
212 "time_weighted_return": -0.04769870075659244,
213 "time_weighted_return_2": -0.04769870075659244,
214 "_custom_twr_audit_note_911328": null,
215 "value": 9868044.398519998
216 },
217 "children": [{
218 "name": "Asset Class A",
219 "grouping": "_custom_level_2_624287",
220 "columns": {
221 "time_weighted_return": 0.000028632718065191298,
222 "time_weighted_return_2": 0.000028632718065191298,
223 "_custom_twr_audit_note_911328": null,
224 "value": 10234.94
225 },
226 "children": [{
227 "entity_id": 10868778,
228 "name": "2012 Desc Tr HBO Thalia",
229 "grouping": "legal_entity",
230 "columns": {
231 "time_weighted_return": 0.0000282679297198829,
232 "time_weighted_return_2": 0.0000282679297198829,
233 "_custom_twr_audit_note_911328": null,
234 "value": 244.28
235 },
236 "children": []
237 },
238 {
239 "entity_id": 10643052,
240 "name": "2013 Irrev Tr HBO Thalia",
241 "grouping": "legal_entity",
242 "columns": {
243 "time_weighted_return": 0.000049373572795108345,
244 "time_weighted_return_2": 0.000049373572795108345,
245 "_custom_twr_audit_note_911328": null,
246 "value": 5081.08
247 },
248 "children": []
249 },
250 {
251 "entity_id": 10598341,
252 "name": "Cht 11th Tr HBO Shirley",
253 "grouping": "legal_entity",
254 "columns": {
255 "time_weighted_return": 0.000006609603754315074,
256 "time_weighted_return_2": 0.000006609603754315074,
257 "_custom_twr_audit_note_911328": null,
258 "value": 1523.62
259 },
260 "children": []
261 },
262 {
263 "entity_id": 10598337,
264 "name": "Cht 11th Tr HBO Hannah",
265 "grouping": "legal_entity",
266 "columns": {
267 "time_weighted_return": 0.000010999769004760296,
268 "time_weighted_return_2": 0.000010999769004760296,
269 "_custom_twr_audit_note_911328": null,
270 "value": 1828.9
271 },
272 "children": []
273 },
274 {
275 "entity_id": 10598334,
276 "name": "Cht 11th Tr HBO Lau",
277 "grouping": "legal_entity",
278 "columns": {
279 "time_weighted_return": 0.000006466673995619843,
280 "time_weighted_return_2": 0.000006466673995619843,
281 "_custom_twr_audit_note_911328": null,
282 "value": 1557.06
283 },
284 "children": []
285 }
286 ]
287 },
288 {
289 "name": "Asset Class B",
290 "grouping": "_custom_level_2_624287",
291 "columns": {
292 "time_weighted_return": -0.024645947842438676,
293 "time_weighted_return_2": -0.024645947842438676,
294 "_custom_twr_audit_note_911328": null,
295 "value": 674052.31962
296 },
297 "children": [{
298 "entity_id": 10868778,
299 "name": "2012 Desc Tr HBO Thalia",
300 "grouping": "legal_entity",
301 "columns": {
302 "time_weighted_return": -0.043304004172576405,
303 "time_weighted_return_2": -0.043304004172576405,
304 "_custom_twr_audit_note_911328": null,
305 "value": 52800.96
306 },
307 "children": []
308 },
309 {
310 "entity_id": 10643052,
311 "name": "2013 Irrev Tr HBO Thalia",
312 "grouping": "legal_entity",
313 "columns": {
314 "time_weighted_return": -0.022408434778798836,
315 "time_weighted_return_2": -0.022408434778798836,
316 "_custom_twr_audit_note_911328": null,
317 "value": 599594.11962
318 },
319 "children": []
320 },
321 {
322 "entity_id": 10598341,
323 "name": "Cht 11th Tr HBO Shirley",
324 "grouping": "legal_entity",
325 "columns": {
326 "time_weighted_return": -0.039799855483646174,
327 "time_weighted_return_2": -0.039799855483646174,
328 "_custom_twr_audit_note_911328": null,
329 "value": 7219.08
330 },
331 "children": []
332 },
333 {
334 "entity_id": 10598337,
335 "name": "Cht 11th Tr HBO Hannah",
336 "grouping": "legal_entity",
337 "columns": {
338 "time_weighted_return": -0.039799855483646174,
339 "time_weighted_return_2": -0.039799855483646174,
340 "_custom_twr_audit_note_911328": null,
341 "value": 7219.08
342 },
343 "children": []
344 },
345 {
346 "entity_id": 10598334,
347 "name": "Cht 11th Tr HBO Lau",
348 "grouping": "legal_entity",
349 "columns": {
350 "time_weighted_return": -0.039799855483646174,
351 "time_weighted_return_2": -0.039799855483646174,
352 "_custom_twr_audit_note_911328": null,
353 "value": 7219.08
354 },
355 "children": []
356 }
357 ]
358 },
359 {
360 "name": "Asset Class C",
361 "grouping": "_custom_level_2_624287",
362 "columns": {
363 "time_weighted_return": -0.03037038746301135,
364 "time_weighted_return_2": -0.03037038746301135,
365 "_custom_twr_audit_note_911328": null,
366 "value": 114472.69744
367 },
368 "children": [{
369 "entity_id": 10868778,
370 "name": "2012 Desc Tr HBO Thalia",
371 "grouping": "legal_entity",
372 "columns": {
373 "time_weighted_return": -0.030370390035505124,
374 "time_weighted_return_2": -0.030370390035505124,
375 "_custom_twr_audit_note_911328": null,
376 "value": 114472.68744000001
377 },
378 "children": []
379 },
380 {
381 "entity_id": 10643052,
382 "name": "2013 Irrev Tr HBO Thalia",
383 "grouping": "legal_entity",
384 "columns": {
385 "time_weighted_return": 0,
386 "time_weighted_return_2": 0,
387 "_custom_twr_audit_note_911328": null,
388 "value": 0.01
389 },
390 "children": []
391 }
392 ]
393 },
394 {
395 "name": "Asset Class D",
396 "grouping": "_custom_level_2_624287",
397 "columns": {
398 "time_weighted_return": -0.06604362523792162,
399 "time_weighted_return_2": -0.06604362523792162,
400 "_custom_twr_audit_note_911328": null,
401 "value": 5722529.229999997
402 },
403 "children": [{
404 "entity_id": 10868778,
405 "name": "2012 Desc Tr HBO Thalia",
406 "grouping": "legal_entity",
407 "columns": {
408 "time_weighted_return": -0.06154960593668424,
409 "time_weighted_return_2": -0.06154960593668424,
410 "_custom_twr_audit_note_911328": null,
411 "value": 1191838.9399999995
412 },
413 "children": []
414 },
415 {
416 "entity_id": 10643052,
417 "name": "2013 Irrev Tr HBO Thalia",
418 "grouping": "legal_entity",
419 "columns": {
420 "time_weighted_return": -0.06750460387418267,
421 "time_weighted_return_2": -0.06750460387418267,
422 "_custom_twr_audit_note_911328": null,
423 "value": 4416618.520000002
424 },
425 "children": []
426 },
427 {
428 "entity_id": 10598341,
429 "name": "Cht 11th Tr HBO Shirley",
430 "grouping": "legal_entity",
431 "columns": {
432 "time_weighted_return": -0.05604507809250081,
433 "time_weighted_return_2": -0.05604507809250081,
434 "_custom_twr_audit_note_911328": null,
435 "value": 38190.33
436 },
437 "children": []
438 },
439 {
440 "entity_id": 10598337,
441 "name": "Cht 11th Tr HBO Hannah",
442 "grouping": "legal_entity",
443 "columns": {
444 "time_weighted_return": -0.05604507809250081,
445 "time_weighted_return_2": -0.05604507809250081,
446 "_custom_twr_audit_note_911328": null,
447 "value": 37940.72
448 },
449 "children": []
450 },
451 {
452 "entity_id": 10598334,
453 "name": "Cht 11th Tr HBO Lau",
454 "grouping": "legal_entity",
455 "columns": {
456 "time_weighted_return": -0.05604507809250081,
457 "time_weighted_return_2": -0.05604507809250081,
458 "_custom_twr_audit_note_911328": null,
459 "value": 37940.72
460 },
461 "children": []
462 }
463 ]
464 },
465 {
466 "name": "Asset Class E",
467 "grouping": "_custom_level_2_624287",
468 "columns": {
469 "time_weighted_return": -0.017118805423322003,
470 "time_weighted_return_2": -0.017118805423322003,
471 "_custom_twr_audit_note_911328": null,
472 "value": 3148495.0914600003
473 },
474 "children": [{
475 "entity_id": 10868778,
476 "name": "2012 Desc Tr HBO Thalia",
477 "grouping": "legal_entity",
478 "columns": {
479 "time_weighted_return": -0.015251157805867277,
480 "time_weighted_return_2": -0.015251157805867277,
481 "_custom_twr_audit_note_911328": null,
482 "value": 800493.06146
483 },
484 "children": []
485 },
486 {
487 "entity_id": 10643052,
488 "name": "2013 Irrev Tr HBO Thalia",
489 "grouping": "legal_entity",
490 "columns": {
491 "time_weighted_return": -0.01739609576880241,
492 "time_weighted_return_2": -0.01739609576880241,
493 "_custom_twr_audit_note_911328": null,
494 "value": 2215511.2700000005
495 },
496 "children": []
497 },
498 {
499 "entity_id": 10598341,
500 "name": "Cht 11th Tr HBO Shirley",
501 "grouping": "legal_entity",
502 "columns": {
503 "time_weighted_return": -0.02085132265594647,
504 "time_weighted_return_2": -0.02085132265594647,
505 "_custom_twr_audit_note_911328": null,
506 "value": 44031.21
507 },
508 "children": []
509 },
510 {
511 "entity_id": 10598337,
512 "name": "Cht 11th Tr HBO Hannah",
513 "grouping": "legal_entity",
514 "columns": {
515 "time_weighted_return": -0.02089393244695803,
516 "time_weighted_return_2": -0.02089393244695803,
517 "_custom_twr_audit_note_911328": null,
518 "value": 44394.159999999996
519 },
520 "children": []
521 },
522 {
523 "entity_id": 10598334,
524 "name": "Cht 11th Tr HBO Lau",
525 "grouping": "legal_entity",
526 "columns": {
527 "time_weighted_return": -0.020607507059866248,
528 "time_weighted_return_2": -0.020607507059866248,
529 "_custom_twr_audit_note_911328": null,
530 "value": 44065.39000000001
531 },
532 "children": []
533 }
534 ]
535 },
536 {
537 "name": "Asset Class F",
538 "grouping": "_custom_level_2_624287",
539 "columns": {
540 "time_weighted_return": -0.0014710489231547497,
541 "time_weighted_return_2": -0.0014710489231547497,
542 "_custom_twr_audit_note_911328": null,
543 "value": 198260.12
544 },
545 "children": [{
546 "entity_id": 10868778,
547 "name": "2012 Desc Tr HBO Thalia",
548 "grouping": "legal_entity",
549 "columns": {
550 "time_weighted_return": -0.0014477244560456848,
551 "time_weighted_return_2": -0.0014477244560456848,
552 "_custom_twr_audit_note_911328": null,
553 "value": 44612.33
554 },
555 "children": []
556 },
557 {
558 "entity_id": 10643052,
559 "name": "2013 Irrev Tr HBO Thalia",
560 "grouping": "legal_entity",
561 "columns": {
562 "time_weighted_return": -0.001477821083437858,
563 "time_weighted_return_2": -0.001477821083437858,
564 "_custom_twr_audit_note_911328": null,
565 "value": 153647.78999999998
566 },
567 "children": []
568 }
569 ]
570 }
571 ]
572 }
573 ]
574 }]
575 }
576 },
577 "included": []
578 }
579}
580
Notes on JSON Object extract
data
- data in here can be ignored, these are aggregated values for underlying children.meta
-columns
– contains the column header values I want to use for each applicablechildren
‘column` key:pair values.groupings
- can be ignored.children
hierarchy – there are 4x levels ofchildren
which can be identified by theirname
as follows –- Family
name
(i.e., ‘Falconer Family’) - Wealth Bucket
name
(e.g., ‘Wealth Bucket A’) - Asset Class
name
(e.g., ‘Asset Class A’) - Fund
name
(e.g., ‘HUDJ Trust’)
- Family
Target Output
this is an extract of target df
structure I am trying to achieve -
portfolio | name | entity_id | Adjusted Value (No Div, USD) | Current Quarter TWR (USD) | YTD TWR (USD) | TWR Audit Note |
---|---|---|---|---|---|---|
Falconer Family | Falconer Family | 23132492.90510712 | -0.046732301295604683 | -0.046732301295604683 | None | |
Falconer Family | Wealth Bucket A | 13264448.506587146 | -0.045960317420568164 | -0.045960317420568164 | None | |
Falconer Family | Asset Class A | 3337.99 | 0.000003434094574039648 | 0.000003434094574039648 | None | |
Falconer Family | HUDJ Trust | 10604454 | 3337.99 | 0.000003434094574039648 | 0.000003434094574039648 | None |
Falconer Family | Asset Class B | 1017004.7192636987 | -0.025871339096964152 | -0.025871339096964152 | None | |
Falconer Family | HUDG Trust | 10604454 | 1017004.7192636987 | -0.025871339096964152 | -0.025871339096964152 | None |
Falconer Family | Asset Class C | 231142.67772000004 | -0.030370376329670656 | -0.030370376329670656 | None | |
Falconer Family | HKDJ Trust | 10604454 | 231142.67772000004 | -0.030370376329670656 | -0.030370376329670656 | None |
Falconer Family | Asset Class D | 9791282.570000006 | -0.05382756475465478 | -0.05382756475465478 | None | |
Falconer Family | HUDW Trust | 10604454 | 9791282.570000006 | -0.05382756475465478 | -0.05382756475465478 | None |
Notes on Target Output
- Portfolio header – for every row, I would like to map the top-level
children
name
value [family name]. E.g., ‘Falconer Family. - Name header – this should simply be the
name
value from each respectivechildren
. - Entity ID – all 4th level
children
entity_id
value should be mapped to this column. - Data columns – regardless of level, all
children
have identicaltime_weighted_return
,time-weighted_return2
andvalue
columns which should be mapped respectively. - TWR Audit Note – these
children
_custom_twr_audit_note_911318
values are currently blank, but will be utilized in the future.
Current Output
My main issue is that you can see that I have only been able to tap into the 1st [Family] and 2nd [Wealth Bucket] children
level. This leaves me missing the 3rd [Asset Class] and 4th [Fund] -
portfolio | name | Adjusted Value (No Div, USD) | Current Quarter TWR (USD) | YTD TWR (USD) | TWR Audit Note) | |
---|---|---|---|---|---|---|
0 | Falconer Family | Falconer Family | 2.313249e+07 | -0.046732 | -0.046732 | None |
1 | Falconer Family | Wealth Bucket A | 1.326445e+07 | -0.045960 | -0.045960 | None |
2 | Falconer Family | Wealth Bucket B | 9.868044e+06 | -0.047699 | -0.047699 | None |
Current code
This is a function which gets me the correct df
formatting, however my main issue is that I haven't been able to find a solution to returning all children, but rather only the top-level -
1{
2 "meta": {
3 "columns": [{
4 "key": "value",
5 "display_name": "Adjusted Value (No Div, USD)",
6 "output_type": "Number",
7 "currency": "USD"
8 },
9 {
10 "key": "time_weighted_return",
11 "display_name": "Current Quarter TWR (USD)",
12 "output_type": "Percent",
13 "currency": "USD"
14 },
15 {
16 "key": "time_weighted_return_2",
17 "display_name": "YTD TWR (USD)",
18 "output_type": "Percent",
19 "currency": "USD"
20 },
21 {
22 "key": "_custom_twr_audit_note_911328",
23 "display_name": "TWR Audit Note",
24 "output_type": "Word"
25 }
26 ],
27 "groupings": [{
28 "key": "_custom_name_747205",
29 "display_name": "* Reporting Client Name"
30 },
31 {
32 "key": "_custom_new_entity_group_453577",
33 "display_name": "NEW Entity Group"
34 },
35 {
36 "key": "_custom_level_2_624287",
37 "display_name": "* Level 2"
38 },
39 {
40 "key": "legal_entity",
41 "display_name": "Legal Entity"
42 }
43 ]
44 },
45 "data": {
46 "type": "portfolio_views",
47 "attributes": {
48 "total": {
49 "name": "Total",
50 "columns": {
51 "time_weighted_return": -0.046732301295604683,
52 "time_weighted_return_2": -0.046732301295604683,
53 "_custom_twr_audit_note_911328": null,
54 "value": 23132492.905107163
55 },
56 "children": [{
57 "name": "Falconer Family",
58 "grouping": "_custom_name_747205",
59 "columns": {
60 "time_weighted_return": -0.046732301295604683,
61 "time_weighted_return_2": -0.046732301295604683,
62 "_custom_twr_audit_note_911328": null,
63 "value": 23132492.905107163
64 },
65 "children": [{
66 "name": "Wealth Bucket A",
67 "grouping": "_custom_new_entity_group_453577",
68 "columns": {
69 "time_weighted_return": -0.045960317420568164,
70 "time_weighted_return_2": -0.045960317420568164,
71 "_custom_twr_audit_note_911328": null,
72 "value": 13264448.506587159
73 },
74 "children": [{
75 "name": "Asset Class A",
76 "grouping": "_custom_level_2_624287",
77 "columns": {
78 "time_weighted_return": 0.000003434094574039648,
79 "time_weighted_return_2": 0.000003434094574039648,
80 "_custom_twr_audit_note_911328": null,
81 "value": 3337.99
82 },
83 "children": [{
84 "entity_id": 10604454,
85 "name": "HUDJ Trust",
86 "grouping": "legal_entity",
87 "columns": {
88 "time_weighted_return": 0.000003434094574039648,
89 "time_weighted_return_2": 0.000003434094574039648,
90 "_custom_twr_audit_note_911328": null,
91 "value": 3337.99
92 },
93 "children": []
94 }]
95 },
96 {
97 "name": "Asset Class B",
98 "grouping": "_custom_level_2_624287",
99 "columns": {
100 "time_weighted_return": -0.025871339096964152,
101 "time_weighted_return_2": -0.025871339096964152,
102 "_custom_twr_audit_note_911328": null,
103 "value": 1017004.7192636987
104 },
105 "children": [{
106 "entity_id": 10604454,
107 "name": "HUDG Trust",
108 "grouping": "legal_entity",
109 "columns": {
110 "time_weighted_return": -0.025871339096964152,
111 "time_weighted_return_2": -0.025871339096964152,
112 "_custom_twr_audit_note_911328": null,
113 "value": 1017004.7192636987
114 },
115 "children": []
116 }]
117 },
118 {
119 "name": "Asset Class C",
120 "grouping": "_custom_level_2_624287",
121 "columns": {
122 "time_weighted_return": -0.030370376329670656,
123 "time_weighted_return_2": -0.030370376329670656,
124 "_custom_twr_audit_note_911328": null,
125 "value": 231142.67772000004
126 },
127 "children": [{
128 "entity_id": 10604454,
129 "name": "HKDJ Trust",
130 "grouping": "legal_entity",
131 "columns": {
132 "time_weighted_return": -0.030370376329670656,
133 "time_weighted_return_2": -0.030370376329670656,
134 "_custom_twr_audit_note_911328": null,
135 "value": 231142.67772000004
136 },
137 "children": []
138 }]
139 },
140 {
141 "name": "Asset Class D",
142 "grouping": "_custom_level_2_624287",
143 "columns": {
144 "time_weighted_return": -0.05382756475465478,
145 "time_weighted_return_2": -0.05382756475465478,
146 "_custom_twr_audit_note_911328": null,
147 "value": 9791282.570000006
148 },
149 "children": [{
150 "entity_id": 10604454,
151 "name": "HUDW Trust",
152 "grouping": "legal_entity",
153 "columns": {
154 "time_weighted_return": -0.05382756475465478,
155 "time_weighted_return_2": -0.05382756475465478,
156 "_custom_twr_audit_note_911328": null,
157 "value": 9791282.570000006
158 },
159 "children": []
160 }]
161 },
162 {
163 "name": "Asset Class E",
164 "grouping": "_custom_level_2_624287",
165 "columns": {
166 "time_weighted_return": -0.01351630404081805,
167 "time_weighted_return_2": -0.01351630404081805,
168 "_custom_twr_audit_note_911328": null,
169 "value": 2153366.6396034593
170 },
171 "children": [{
172 "entity_id": 10604454,
173 "name": "HJDJ Trust",
174 "grouping": "legal_entity",
175 "columns": {
176 "time_weighted_return": -0.01351630404081805,
177 "time_weighted_return_2": -0.01351630404081805,
178 "_custom_twr_audit_note_911328": null,
179 "value": 2153366.6396034593
180 },
181 "children": []
182 }]
183 },
184 {
185 "name": "Asset Class F",
186 "grouping": "_custom_level_2_624287",
187 "columns": {
188 "time_weighted_return": -0.002298190175237247,
189 "time_weighted_return_2": -0.002298190175237247,
190 "_custom_twr_audit_note_911328": null,
191 "value": 68313.90999999999
192 },
193 "children": [{
194 "entity_id": 10604454,
195 "name": "HADJ Trust",
196 "grouping": "legal_entity",
197 "columns": {
198 "time_weighted_return": -0.002298190175237247,
199 "time_weighted_return_2": -0.002298190175237247,
200 "_custom_twr_audit_note_911328": null,
201 "value": 68313.90999999999
202 },
203 "children": []
204 }]
205 }
206 ]
207 },
208 {
209 "name": "Wealth Bucket B",
210 "grouping": "_custom_new_entity_group_453577",
211 "columns": {
212 "time_weighted_return": -0.04769870075659244,
213 "time_weighted_return_2": -0.04769870075659244,
214 "_custom_twr_audit_note_911328": null,
215 "value": 9868044.398519998
216 },
217 "children": [{
218 "name": "Asset Class A",
219 "grouping": "_custom_level_2_624287",
220 "columns": {
221 "time_weighted_return": 0.000028632718065191298,
222 "time_weighted_return_2": 0.000028632718065191298,
223 "_custom_twr_audit_note_911328": null,
224 "value": 10234.94
225 },
226 "children": [{
227 "entity_id": 10868778,
228 "name": "2012 Desc Tr HBO Thalia",
229 "grouping": "legal_entity",
230 "columns": {
231 "time_weighted_return": 0.0000282679297198829,
232 "time_weighted_return_2": 0.0000282679297198829,
233 "_custom_twr_audit_note_911328": null,
234 "value": 244.28
235 },
236 "children": []
237 },
238 {
239 "entity_id": 10643052,
240 "name": "2013 Irrev Tr HBO Thalia",
241 "grouping": "legal_entity",
242 "columns": {
243 "time_weighted_return": 0.000049373572795108345,
244 "time_weighted_return_2": 0.000049373572795108345,
245 "_custom_twr_audit_note_911328": null,
246 "value": 5081.08
247 },
248 "children": []
249 },
250 {
251 "entity_id": 10598341,
252 "name": "Cht 11th Tr HBO Shirley",
253 "grouping": "legal_entity",
254 "columns": {
255 "time_weighted_return": 0.000006609603754315074,
256 "time_weighted_return_2": 0.000006609603754315074,
257 "_custom_twr_audit_note_911328": null,
258 "value": 1523.62
259 },
260 "children": []
261 },
262 {
263 "entity_id": 10598337,
264 "name": "Cht 11th Tr HBO Hannah",
265 "grouping": "legal_entity",
266 "columns": {
267 "time_weighted_return": 0.000010999769004760296,
268 "time_weighted_return_2": 0.000010999769004760296,
269 "_custom_twr_audit_note_911328": null,
270 "value": 1828.9
271 },
272 "children": []
273 },
274 {
275 "entity_id": 10598334,
276 "name": "Cht 11th Tr HBO Lau",
277 "grouping": "legal_entity",
278 "columns": {
279 "time_weighted_return": 0.000006466673995619843,
280 "time_weighted_return_2": 0.000006466673995619843,
281 "_custom_twr_audit_note_911328": null,
282 "value": 1557.06
283 },
284 "children": []
285 }
286 ]
287 },
288 {
289 "name": "Asset Class B",
290 "grouping": "_custom_level_2_624287",
291 "columns": {
292 "time_weighted_return": -0.024645947842438676,
293 "time_weighted_return_2": -0.024645947842438676,
294 "_custom_twr_audit_note_911328": null,
295 "value": 674052.31962
296 },
297 "children": [{
298 "entity_id": 10868778,
299 "name": "2012 Desc Tr HBO Thalia",
300 "grouping": "legal_entity",
301 "columns": {
302 "time_weighted_return": -0.043304004172576405,
303 "time_weighted_return_2": -0.043304004172576405,
304 "_custom_twr_audit_note_911328": null,
305 "value": 52800.96
306 },
307 "children": []
308 },
309 {
310 "entity_id": 10643052,
311 "name": "2013 Irrev Tr HBO Thalia",
312 "grouping": "legal_entity",
313 "columns": {
314 "time_weighted_return": -0.022408434778798836,
315 "time_weighted_return_2": -0.022408434778798836,
316 "_custom_twr_audit_note_911328": null,
317 "value": 599594.11962
318 },
319 "children": []
320 },
321 {
322 "entity_id": 10598341,
323 "name": "Cht 11th Tr HBO Shirley",
324 "grouping": "legal_entity",
325 "columns": {
326 "time_weighted_return": -0.039799855483646174,
327 "time_weighted_return_2": -0.039799855483646174,
328 "_custom_twr_audit_note_911328": null,
329 "value": 7219.08
330 },
331 "children": []
332 },
333 {
334 "entity_id": 10598337,
335 "name": "Cht 11th Tr HBO Hannah",
336 "grouping": "legal_entity",
337 "columns": {
338 "time_weighted_return": -0.039799855483646174,
339 "time_weighted_return_2": -0.039799855483646174,
340 "_custom_twr_audit_note_911328": null,
341 "value": 7219.08
342 },
343 "children": []
344 },
345 {
346 "entity_id": 10598334,
347 "name": "Cht 11th Tr HBO Lau",
348 "grouping": "legal_entity",
349 "columns": {
350 "time_weighted_return": -0.039799855483646174,
351 "time_weighted_return_2": -0.039799855483646174,
352 "_custom_twr_audit_note_911328": null,
353 "value": 7219.08
354 },
355 "children": []
356 }
357 ]
358 },
359 {
360 "name": "Asset Class C",
361 "grouping": "_custom_level_2_624287",
362 "columns": {
363 "time_weighted_return": -0.03037038746301135,
364 "time_weighted_return_2": -0.03037038746301135,
365 "_custom_twr_audit_note_911328": null,
366 "value": 114472.69744
367 },
368 "children": [{
369 "entity_id": 10868778,
370 "name": "2012 Desc Tr HBO Thalia",
371 "grouping": "legal_entity",
372 "columns": {
373 "time_weighted_return": -0.030370390035505124,
374 "time_weighted_return_2": -0.030370390035505124,
375 "_custom_twr_audit_note_911328": null,
376 "value": 114472.68744000001
377 },
378 "children": []
379 },
380 {
381 "entity_id": 10643052,
382 "name": "2013 Irrev Tr HBO Thalia",
383 "grouping": "legal_entity",
384 "columns": {
385 "time_weighted_return": 0,
386 "time_weighted_return_2": 0,
387 "_custom_twr_audit_note_911328": null,
388 "value": 0.01
389 },
390 "children": []
391 }
392 ]
393 },
394 {
395 "name": "Asset Class D",
396 "grouping": "_custom_level_2_624287",
397 "columns": {
398 "time_weighted_return": -0.06604362523792162,
399 "time_weighted_return_2": -0.06604362523792162,
400 "_custom_twr_audit_note_911328": null,
401 "value": 5722529.229999997
402 },
403 "children": [{
404 "entity_id": 10868778,
405 "name": "2012 Desc Tr HBO Thalia",
406 "grouping": "legal_entity",
407 "columns": {
408 "time_weighted_return": -0.06154960593668424,
409 "time_weighted_return_2": -0.06154960593668424,
410 "_custom_twr_audit_note_911328": null,
411 "value": 1191838.9399999995
412 },
413 "children": []
414 },
415 {
416 "entity_id": 10643052,
417 "name": "2013 Irrev Tr HBO Thalia",
418 "grouping": "legal_entity",
419 "columns": {
420 "time_weighted_return": -0.06750460387418267,
421 "time_weighted_return_2": -0.06750460387418267,
422 "_custom_twr_audit_note_911328": null,
423 "value": 4416618.520000002
424 },
425 "children": []
426 },
427 {
428 "entity_id": 10598341,
429 "name": "Cht 11th Tr HBO Shirley",
430 "grouping": "legal_entity",
431 "columns": {
432 "time_weighted_return": -0.05604507809250081,
433 "time_weighted_return_2": -0.05604507809250081,
434 "_custom_twr_audit_note_911328": null,
435 "value": 38190.33
436 },
437 "children": []
438 },
439 {
440 "entity_id": 10598337,
441 "name": "Cht 11th Tr HBO Hannah",
442 "grouping": "legal_entity",
443 "columns": {
444 "time_weighted_return": -0.05604507809250081,
445 "time_weighted_return_2": -0.05604507809250081,
446 "_custom_twr_audit_note_911328": null,
447 "value": 37940.72
448 },
449 "children": []
450 },
451 {
452 "entity_id": 10598334,
453 "name": "Cht 11th Tr HBO Lau",
454 "grouping": "legal_entity",
455 "columns": {
456 "time_weighted_return": -0.05604507809250081,
457 "time_weighted_return_2": -0.05604507809250081,
458 "_custom_twr_audit_note_911328": null,
459 "value": 37940.72
460 },
461 "children": []
462 }
463 ]
464 },
465 {
466 "name": "Asset Class E",
467 "grouping": "_custom_level_2_624287",
468 "columns": {
469 "time_weighted_return": -0.017118805423322003,
470 "time_weighted_return_2": -0.017118805423322003,
471 "_custom_twr_audit_note_911328": null,
472 "value": 3148495.0914600003
473 },
474 "children": [{
475 "entity_id": 10868778,
476 "name": "2012 Desc Tr HBO Thalia",
477 "grouping": "legal_entity",
478 "columns": {
479 "time_weighted_return": -0.015251157805867277,
480 "time_weighted_return_2": -0.015251157805867277,
481 "_custom_twr_audit_note_911328": null,
482 "value": 800493.06146
483 },
484 "children": []
485 },
486 {
487 "entity_id": 10643052,
488 "name": "2013 Irrev Tr HBO Thalia",
489 "grouping": "legal_entity",
490 "columns": {
491 "time_weighted_return": -0.01739609576880241,
492 "time_weighted_return_2": -0.01739609576880241,
493 "_custom_twr_audit_note_911328": null,
494 "value": 2215511.2700000005
495 },
496 "children": []
497 },
498 {
499 "entity_id": 10598341,
500 "name": "Cht 11th Tr HBO Shirley",
501 "grouping": "legal_entity",
502 "columns": {
503 "time_weighted_return": -0.02085132265594647,
504 "time_weighted_return_2": -0.02085132265594647,
505 "_custom_twr_audit_note_911328": null,
506 "value": 44031.21
507 },
508 "children": []
509 },
510 {
511 "entity_id": 10598337,
512 "name": "Cht 11th Tr HBO Hannah",
513 "grouping": "legal_entity",
514 "columns": {
515 "time_weighted_return": -0.02089393244695803,
516 "time_weighted_return_2": -0.02089393244695803,
517 "_custom_twr_audit_note_911328": null,
518 "value": 44394.159999999996
519 },
520 "children": []
521 },
522 {
523 "entity_id": 10598334,
524 "name": "Cht 11th Tr HBO Lau",
525 "grouping": "legal_entity",
526 "columns": {
527 "time_weighted_return": -0.020607507059866248,
528 "time_weighted_return_2": -0.020607507059866248,
529 "_custom_twr_audit_note_911328": null,
530 "value": 44065.39000000001
531 },
532 "children": []
533 }
534 ]
535 },
536 {
537 "name": "Asset Class F",
538 "grouping": "_custom_level_2_624287",
539 "columns": {
540 "time_weighted_return": -0.0014710489231547497,
541 "time_weighted_return_2": -0.0014710489231547497,
542 "_custom_twr_audit_note_911328": null,
543 "value": 198260.12
544 },
545 "children": [{
546 "entity_id": 10868778,
547 "name": "2012 Desc Tr HBO Thalia",
548 "grouping": "legal_entity",
549 "columns": {
550 "time_weighted_return": -0.0014477244560456848,
551 "time_weighted_return_2": -0.0014477244560456848,
552 "_custom_twr_audit_note_911328": null,
553 "value": 44612.33
554 },
555 "children": []
556 },
557 {
558 "entity_id": 10643052,
559 "name": "2013 Irrev Tr HBO Thalia",
560 "grouping": "legal_entity",
561 "columns": {
562 "time_weighted_return": -0.001477821083437858,
563 "time_weighted_return_2": -0.001477821083437858,
564 "_custom_twr_audit_note_911328": null,
565 "value": 153647.78999999998
566 },
567 "children": []
568 }
569 ]
570 }
571 ]
572 }
573 ]
574 }]
575 }
576 },
577 "included": []
578 }
579}
580# Function to read API response / JSON Object
581def response_writer():
582 with open('api_response_2022-02-13.json') as f:
583 api_response = json.load(f)
584 return api_response
585
586# Function to unpack JSON response into pandas dataframe.
587def unpack_response():
588 while True:
589 try:
590 api_response = response_writer()
591 portfolio_views_children = api_response['data']['attributes']['total']['children']
592 portfolios = []
593 for portfolio in portfolio_views_children:
594 entity_columns = []
595 # include portfolio itself within an iterable so the total is the header
596 for entity in itertools.chain([portfolio], portfolio["children"]):
597 entity_data = entity["columns"].copy() # don't mutate original response
598 entity_data["portfolio"] = portfolio["name"] # from outer
599 entity_data["name"] = entity["name"]
600 entity_columns.append(entity_data)
601
602 df = pd.DataFrame(entity_columns)
603 portfolios.append(df)
604
605 # combine dataframes
606 df = pd.concat(portfolios)
607 # reorder and rename
608 column_ordering = {"portfolio": "portfolio", "name": "name"}
609 column_ordering.update({c["key"]: c["display_name"] for c in api_response["meta"]["columns"]})
610 df = df[column_ordering.keys()] # beware: un-named cols will be dropped
611 df = df.rename(columns=column_ordering)
612 break
613 except KeyError:
614 print("-----------------------------------\n","API TIMEOUT ERROR: TRY AGAIN...", "\n-----------------------------------\n")
615 return df
616unpack_response()
617
Help
In short, I am looking for some advice on how I can tap into the remaining children
by enhancing the existing code. Whilst I have taken much time to fully explain my problem, please ask if anything isn't clear. Please note that the JSON may have multiple families, so the solution / advice offered must observe this
ANSWER
Answered 2022-Feb-16 at 06:41I think this gets you pretty close; might just need to adjust the various name
columns and drop the extra data (I kept the grouping
column).
The main idea is to recursively use pd.json_normalize with pd.concat for all availalable children
levels.
EDIT: Put everything into a single function and added section to collapse the name
columns like the expected output.
1{
2 "meta": {
3 "columns": [{
4 "key": "value",
5 "display_name": "Adjusted Value (No Div, USD)",
6 "output_type": "Number",
7 "currency": "USD"
8 },
9 {
10 "key": "time_weighted_return",
11 "display_name": "Current Quarter TWR (USD)",
12 "output_type": "Percent",
13 "currency": "USD"
14 },
15 {
16 "key": "time_weighted_return_2",
17 "display_name": "YTD TWR (USD)",
18 "output_type": "Percent",
19 "currency": "USD"
20 },
21 {
22 "key": "_custom_twr_audit_note_911328",
23 "display_name": "TWR Audit Note",
24 "output_type": "Word"
25 }
26 ],
27 "groupings": [{
28 "key": "_custom_name_747205",
29 "display_name": "* Reporting Client Name"
30 },
31 {
32 "key": "_custom_new_entity_group_453577",
33 "display_name": "NEW Entity Group"
34 },
35 {
36 "key": "_custom_level_2_624287",
37 "display_name": "* Level 2"
38 },
39 {
40 "key": "legal_entity",
41 "display_name": "Legal Entity"
42 }
43 ]
44 },
45 "data": {
46 "type": "portfolio_views",
47 "attributes": {
48 "total": {
49 "name": "Total",
50 "columns": {
51 "time_weighted_return": -0.046732301295604683,
52 "time_weighted_return_2": -0.046732301295604683,
53 "_custom_twr_audit_note_911328": null,
54 "value": 23132492.905107163
55 },
56 "children": [{
57 "name": "Falconer Family",
58 "grouping": "_custom_name_747205",
59 "columns": {
60 "time_weighted_return": -0.046732301295604683,
61 "time_weighted_return_2": -0.046732301295604683,
62 "_custom_twr_audit_note_911328": null,
63 "value": 23132492.905107163
64 },
65 "children": [{
66 "name": "Wealth Bucket A",
67 "grouping": "_custom_new_entity_group_453577",
68 "columns": {
69 "time_weighted_return": -0.045960317420568164,
70 "time_weighted_return_2": -0.045960317420568164,
71 "_custom_twr_audit_note_911328": null,
72 "value": 13264448.506587159
73 },
74 "children": [{
75 "name": "Asset Class A",
76 "grouping": "_custom_level_2_624287",
77 "columns": {
78 "time_weighted_return": 0.000003434094574039648,
79 "time_weighted_return_2": 0.000003434094574039648,
80 "_custom_twr_audit_note_911328": null,
81 "value": 3337.99
82 },
83 "children": [{
84 "entity_id": 10604454,
85 "name": "HUDJ Trust",
86 "grouping": "legal_entity",
87 "columns": {
88 "time_weighted_return": 0.000003434094574039648,
89 "time_weighted_return_2": 0.000003434094574039648,
90 "_custom_twr_audit_note_911328": null,
91 "value": 3337.99
92 },
93 "children": []
94 }]
95 },
96 {
97 "name": "Asset Class B",
98 "grouping": "_custom_level_2_624287",
99 "columns": {
100 "time_weighted_return": -0.025871339096964152,
101 "time_weighted_return_2": -0.025871339096964152,
102 "_custom_twr_audit_note_911328": null,
103 "value": 1017004.7192636987
104 },
105 "children": [{
106 "entity_id": 10604454,
107 "name": "HUDG Trust",
108 "grouping": "legal_entity",
109 "columns": {
110 "time_weighted_return": -0.025871339096964152,
111 "time_weighted_return_2": -0.025871339096964152,
112 "_custom_twr_audit_note_911328": null,
113 "value": 1017004.7192636987
114 },
115 "children": []
116 }]
117 },
118 {
119 "name": "Asset Class C",
120 "grouping": "_custom_level_2_624287",
121 "columns": {
122 "time_weighted_return": -0.030370376329670656,
123 "time_weighted_return_2": -0.030370376329670656,
124 "_custom_twr_audit_note_911328": null,
125 "value": 231142.67772000004
126 },
127 "children": [{
128 "entity_id": 10604454,
129 "name": "HKDJ Trust",
130 "grouping": "legal_entity",
131 "columns": {
132 "time_weighted_return": -0.030370376329670656,
133 "time_weighted_return_2": -0.030370376329670656,
134 "_custom_twr_audit_note_911328": null,
135 "value": 231142.67772000004
136 },
137 "children": []
138 }]
139 },
140 {
141 "name": "Asset Class D",
142 "grouping": "_custom_level_2_624287",
143 "columns": {
144 "time_weighted_return": -0.05382756475465478,
145 "time_weighted_return_2": -0.05382756475465478,
146 "_custom_twr_audit_note_911328": null,
147 "value": 9791282.570000006
148 },
149 "children": [{
150 "entity_id": 10604454,
151 "name": "HUDW Trust",
152 "grouping": "legal_entity",
153 "columns": {
154 "time_weighted_return": -0.05382756475465478,
155 "time_weighted_return_2": -0.05382756475465478,
156 "_custom_twr_audit_note_911328": null,
157 "value": 9791282.570000006
158 },
159 "children": []
160 }]
161 },
162 {
163 "name": "Asset Class E",
164 "grouping": "_custom_level_2_624287",
165 "columns": {
166 "time_weighted_return": -0.01351630404081805,
167 "time_weighted_return_2": -0.01351630404081805,
168 "_custom_twr_audit_note_911328": null,
169 "value": 2153366.6396034593
170 },
171 "children": [{
172 "entity_id": 10604454,
173 "name": "HJDJ Trust",
174 "grouping": "legal_entity",
175 "columns": {
176 "time_weighted_return": -0.01351630404081805,
177 "time_weighted_return_2": -0.01351630404081805,
178 "_custom_twr_audit_note_911328": null,
179 "value": 2153366.6396034593
180 },
181 "children": []
182 }]
183 },
184 {
185 "name": "Asset Class F",
186 "grouping": "_custom_level_2_624287",
187 "columns": {
188 "time_weighted_return": -0.002298190175237247,
189 "time_weighted_return_2": -0.002298190175237247,
190 "_custom_twr_audit_note_911328": null,
191 "value": 68313.90999999999
192 },
193 "children": [{
194 "entity_id": 10604454,
195 "name": "HADJ Trust",
196 "grouping": "legal_entity",
197 "columns": {
198 "time_weighted_return": -0.002298190175237247,
199 "time_weighted_return_2": -0.002298190175237247,
200 "_custom_twr_audit_note_911328": null,
201 "value": 68313.90999999999
202 },
203 "children": []
204 }]
205 }
206 ]
207 },
208 {
209 "name": "Wealth Bucket B",
210 "grouping": "_custom_new_entity_group_453577",
211 "columns": {
212 "time_weighted_return": -0.04769870075659244,
213 "time_weighted_return_2": -0.04769870075659244,
214 "_custom_twr_audit_note_911328": null,
215 "value": 9868044.398519998
216 },
217 "children": [{
218 "name": "Asset Class A",
219 "grouping": "_custom_level_2_624287",
220 "columns": {
221 "time_weighted_return": 0.000028632718065191298,
222 "time_weighted_return_2": 0.000028632718065191298,
223 "_custom_twr_audit_note_911328": null,
224 "value": 10234.94
225 },
226 "children": [{
227 "entity_id": 10868778,
228 "name": "2012 Desc Tr HBO Thalia",
229 "grouping": "legal_entity",
230 "columns": {
231 "time_weighted_return": 0.0000282679297198829,
232 "time_weighted_return_2": 0.0000282679297198829,
233 "_custom_twr_audit_note_911328": null,
234 "value": 244.28
235 },
236 "children": []
237 },
238 {
239 "entity_id": 10643052,
240 "name": "2013 Irrev Tr HBO Thalia",
241 "grouping": "legal_entity",
242 "columns": {
243 "time_weighted_return": 0.000049373572795108345,
244 "time_weighted_return_2": 0.000049373572795108345,
245 "_custom_twr_audit_note_911328": null,
246 "value": 5081.08
247 },
248 "children": []
249 },
250 {
251 "entity_id": 10598341,
252 "name": "Cht 11th Tr HBO Shirley",
253 "grouping": "legal_entity",
254 "columns": {
255 "time_weighted_return": 0.000006609603754315074,
256 "time_weighted_return_2": 0.000006609603754315074,
257 "_custom_twr_audit_note_911328": null,
258 "value": 1523.62
259 },
260 "children": []
261 },
262 {
263 "entity_id": 10598337,
264 "name": "Cht 11th Tr HBO Hannah",
265 "grouping": "legal_entity",
266 "columns": {
267 "time_weighted_return": 0.000010999769004760296,
268 "time_weighted_return_2": 0.000010999769004760296,
269 "_custom_twr_audit_note_911328": null,
270 "value": 1828.9
271 },
272 "children": []
273 },
274 {
275 "entity_id": 10598334,
276 "name": "Cht 11th Tr HBO Lau",
277 "grouping": "legal_entity",
278 "columns": {
279 "time_weighted_return": 0.000006466673995619843,
280 "time_weighted_return_2": 0.000006466673995619843,
281 "_custom_twr_audit_note_911328": null,
282 "value": 1557.06
283 },
284 "children": []
285 }
286 ]
287 },
288 {
289 "name": "Asset Class B",
290 "grouping": "_custom_level_2_624287",
291 "columns": {
292 "time_weighted_return": -0.024645947842438676,
293 "time_weighted_return_2": -0.024645947842438676,
294 "_custom_twr_audit_note_911328": null,
295 "value": 674052.31962
296 },
297 "children": [{
298 "entity_id": 10868778,
299 "name": "2012 Desc Tr HBO Thalia",
300 "grouping": "legal_entity",
301 "columns": {
302 "time_weighted_return": -0.043304004172576405,
303 "time_weighted_return_2": -0.043304004172576405,
304 "_custom_twr_audit_note_911328": null,
305 "value": 52800.96
306 },
307 "children": []
308 },
309 {
310 "entity_id": 10643052,
311 "name": "2013 Irrev Tr HBO Thalia",
312 "grouping": "legal_entity",
313 "columns": {
314 "time_weighted_return": -0.022408434778798836,
315 "time_weighted_return_2": -0.022408434778798836,
316 "_custom_twr_audit_note_911328": null,
317 "value": 599594.11962
318 },
319 "children": []
320 },
321 {
322 "entity_id": 10598341,
323 "name": "Cht 11th Tr HBO Shirley",
324 "grouping": "legal_entity",
325 "columns": {
326 "time_weighted_return": -0.039799855483646174,
327 "time_weighted_return_2": -0.039799855483646174,
328 "_custom_twr_audit_note_911328": null,
329 "value": 7219.08
330 },
331 "children": []
332 },
333 {
334 "entity_id": 10598337,
335 "name": "Cht 11th Tr HBO Hannah",
336 "grouping": "legal_entity",
337 "columns": {
338 "time_weighted_return": -0.039799855483646174,
339 "time_weighted_return_2": -0.039799855483646174,
340 "_custom_twr_audit_note_911328": null,
341 "value": 7219.08
342 },
343 "children": []
344 },
345 {
346 "entity_id": 10598334,
347 "name": "Cht 11th Tr HBO Lau",
348 "grouping": "legal_entity",
349 "columns": {
350 "time_weighted_return": -0.039799855483646174,
351 "time_weighted_return_2": -0.039799855483646174,
352 "_custom_twr_audit_note_911328": null,
353 "value": 7219.08
354 },
355 "children": []
356 }
357 ]
358 },
359 {
360 "name": "Asset Class C",
361 "grouping": "_custom_level_2_624287",
362 "columns": {
363 "time_weighted_return": -0.03037038746301135,
364 "time_weighted_return_2": -0.03037038746301135,
365 "_custom_twr_audit_note_911328": null,
366 "value": 114472.69744
367 },
368 "children": [{
369 "entity_id": 10868778,
370 "name": "2012 Desc Tr HBO Thalia",
371 "grouping": "legal_entity",
372 "columns": {
373 "time_weighted_return": -0.030370390035505124,
374 "time_weighted_return_2": -0.030370390035505124,
375 "_custom_twr_audit_note_911328": null,
376 "value": 114472.68744000001
377 },
378 "children": []
379 },
380 {
381 "entity_id": 10643052,
382 "name": "2013 Irrev Tr HBO Thalia",
383 "grouping": "legal_entity",
384 "columns": {
385 "time_weighted_return": 0,
386 "time_weighted_return_2": 0,
387 "_custom_twr_audit_note_911328": null,
388 "value": 0.01
389 },
390 "children": []
391 }
392 ]
393 },
394 {
395 "name": "Asset Class D",
396 "grouping": "_custom_level_2_624287",
397 "columns": {
398 "time_weighted_return": -0.06604362523792162,
399 "time_weighted_return_2": -0.06604362523792162,
400 "_custom_twr_audit_note_911328": null,
401 "value": 5722529.229999997
402 },
403 "children": [{
404 "entity_id": 10868778,
405 "name": "2012 Desc Tr HBO Thalia",
406 "grouping": "legal_entity",
407 "columns": {
408 "time_weighted_return": -0.06154960593668424,
409 "time_weighted_return_2": -0.06154960593668424,
410 "_custom_twr_audit_note_911328": null,
411 "value": 1191838.9399999995
412 },
413 "children": []
414 },
415 {
416 "entity_id": 10643052,
417 "name": "2013 Irrev Tr HBO Thalia",
418 "grouping": "legal_entity",
419 "columns": {
420 "time_weighted_return": -0.06750460387418267,
421 "time_weighted_return_2": -0.06750460387418267,
422 "_custom_twr_audit_note_911328": null,
423 "value": 4416618.520000002
424 },
425 "children": []
426 },
427 {
428 "entity_id": 10598341,
429 "name": "Cht 11th Tr HBO Shirley",
430 "grouping": "legal_entity",
431 "columns": {
432 "time_weighted_return": -0.05604507809250081,
433 "time_weighted_return_2": -0.05604507809250081,
434 "_custom_twr_audit_note_911328": null,
435 "value": 38190.33
436 },
437 "children": []
438 },
439 {
440 "entity_id": 10598337,
441 "name": "Cht 11th Tr HBO Hannah",
442 "grouping": "legal_entity",
443 "columns": {
444 "time_weighted_return": -0.05604507809250081,
445 "time_weighted_return_2": -0.05604507809250081,
446 "_custom_twr_audit_note_911328": null,
447 "value": 37940.72
448 },
449 "children": []
450 },
451 {
452 "entity_id": 10598334,
453 "name": "Cht 11th Tr HBO Lau",
454 "grouping": "legal_entity",
455 "columns": {
456 "time_weighted_return": -0.05604507809250081,
457 "time_weighted_return_2": -0.05604507809250081,
458 "_custom_twr_audit_note_911328": null,
459 "value": 37940.72
460 },
461 "children": []
462 }
463 ]
464 },
465 {
466 "name": "Asset Class E",
467 "grouping": "_custom_level_2_624287",
468 "columns": {
469 "time_weighted_return": -0.017118805423322003,
470 "time_weighted_return_2": -0.017118805423322003,
471 "_custom_twr_audit_note_911328": null,
472 "value": 3148495.0914600003
473 },
474 "children": [{
475 "entity_id": 10868778,
476 "name": "2012 Desc Tr HBO Thalia",
477 "grouping": "legal_entity",
478 "columns": {
479 "time_weighted_return": -0.015251157805867277,
480 "time_weighted_return_2": -0.015251157805867277,
481 "_custom_twr_audit_note_911328": null,
482 "value": 800493.06146
483 },
484 "children": []
485 },
486 {
487 "entity_id": 10643052,
488 "name": "2013 Irrev Tr HBO Thalia",
489 "grouping": "legal_entity",
490 "columns": {
491 "time_weighted_return": -0.01739609576880241,
492 "time_weighted_return_2": -0.01739609576880241,
493 "_custom_twr_audit_note_911328": null,
494 "value": 2215511.2700000005
495 },
496 "children": []
497 },
498 {
499 "entity_id": 10598341,
500 "name": "Cht 11th Tr HBO Shirley",
501 "grouping": "legal_entity",
502 "columns": {
503 "time_weighted_return": -0.02085132265594647,
504 "time_weighted_return_2": -0.02085132265594647,
505 "_custom_twr_audit_note_911328": null,
506 "value": 44031.21
507 },
508 "children": []
509 },
510 {
511 "entity_id": 10598337,
512 "name": "Cht 11th Tr HBO Hannah",
513 "grouping": "legal_entity",
514 "columns": {
515 "time_weighted_return": -0.02089393244695803,
516 "time_weighted_return_2": -0.02089393244695803,
517 "_custom_twr_audit_note_911328": null,
518 "value": 44394.159999999996
519 },
520 "children": []
521 },
522 {
523 "entity_id": 10598334,
524 "name": "Cht 11th Tr HBO Lau",
525 "grouping": "legal_entity",
526 "columns": {
527 "time_weighted_return": -0.020607507059866248,
528 "time_weighted_return_2": -0.020607507059866248,
529 "_custom_twr_audit_note_911328": null,
530 "value": 44065.39000000001
531 },
532 "children": []
533 }
534 ]
535 },
536 {
537 "name": "Asset Class F",
538 "grouping": "_custom_level_2_624287",
539 "columns": {
540 "time_weighted_return": -0.0014710489231547497,
541 "time_weighted_return_2": -0.0014710489231547497,
542 "_custom_twr_audit_note_911328": null,
543 "value": 198260.12
544 },
545 "children": [{
546 "entity_id": 10868778,
547 "name": "2012 Desc Tr HBO Thalia",
548 "grouping": "legal_entity",
549 "columns": {
550 "time_weighted_return": -0.0014477244560456848,
551 "time_weighted_return_2": -0.0014477244560456848,
552 "_custom_twr_audit_note_911328": null,
553 "value": 44612.33
554 },
555 "children": []
556 },
557 {
558 "entity_id": 10643052,
559 "name": "2013 Irrev Tr HBO Thalia",
560 "grouping": "legal_entity",
561 "columns": {
562 "time_weighted_return": -0.001477821083437858,
563 "time_weighted_return_2": -0.001477821083437858,
564 "_custom_twr_audit_note_911328": null,
565 "value": 153647.78999999998
566 },
567 "children": []
568 }
569 ]
570 }
571 ]
572 }
573 ]
574 }]
575 }
576 },
577 "included": []
578 }
579}
580# Function to read API response / JSON Object
581def response_writer():
582 with open('api_response_2022-02-13.json') as f:
583 api_response = json.load(f)
584 return api_response
585
586# Function to unpack JSON response into pandas dataframe.
587def unpack_response():
588 while True:
589 try:
590 api_response = response_writer()
591 portfolio_views_children = api_response['data']['attributes']['total']['children']
592 portfolios = []
593 for portfolio in portfolio_views_children:
594 entity_columns = []
595 # include portfolio itself within an iterable so the total is the header
596 for entity in itertools.chain([portfolio], portfolio["children"]):
597 entity_data = entity["columns"].copy() # don't mutate original response
598 entity_data["portfolio"] = portfolio["name"] # from outer
599 entity_data["name"] = entity["name"]
600 entity_columns.append(entity_data)
601
602 df = pd.DataFrame(entity_columns)
603 portfolios.append(df)
604
605 # combine dataframes
606 df = pd.concat(portfolios)
607 # reorder and rename
608 column_ordering = {"portfolio": "portfolio", "name": "name"}
609 column_ordering.update({c["key"]: c["display_name"] for c in api_response["meta"]["columns"]})
610 df = df[column_ordering.keys()] # beware: un-named cols will be dropped
611 df = df.rename(columns=column_ordering)
612 break
613 except KeyError:
614 print("-----------------------------------\n","API TIMEOUT ERROR: TRY AGAIN...", "\n-----------------------------------\n")
615 return df
616unpack_response()
617def process_json(api_response):
618
619 def get_column_values(df):
620 return pd.concat([df, pd.json_normalize(df.pop('columns')).set_axis(df.index)], axis=1)
621
622 def expand_children(df):
623 if len(df.index) > 1:
624 df['children'] = df['children'].fillna('').apply(lambda x: None if len(x) == 0 else x)
625 df_children = df.pop('children').dropna().explode()
626 if len(df_children.index) == 0: # return df if no children to append
627 return df.index.names, df
628 df_children = pd.json_normalize(df_children, max_level=0).set_axis(df_children.index).set_index('name', append=True)
629 df_children = get_column_values(df_children)
630 idx_names = list(df_children.index.names)
631 idx_names[-1] = idx_names[-1] + '_' + str(len(idx_names))
632 df[idx_names[-1]] = None
633 return idx_names, pd.concat([df.set_index(idx_names[-1], append=True), df_children], axis=0)
634
635 columns_dict = pd.DataFrame(api_response['meta']['columns']).set_index('key').to_dict(orient='index') # save column definitions
636 df = pd.DataFrame(api_response['data']['attributes']['total']['children']).set_index('name') # get initial dataframe
637 df = get_column_values(df) # get columns for initial level
638
639 # expand children
640 while 'children' in df.columns:
641 idx_names, df = expand_children(df)
642
643 # reorder/replace column headers and sort index
644 df = (df.loc[:, [x for x in df.columns if x not in columns_dict.keys()] + list(columns_dict.keys())]
645 .rename(columns={k:v['display_name'] for k,v in columns_dict.items()})
646 .sort_index(na_position='first').reset_index())
647
648 #collapse "name" columns (careful of potential duplicate rows)
649 for col in idx_names[::-1]:
650 df[idx_names[-1]] = df[idx_names[-1]].fillna(df[col])
651 df = df.rename(columns={'name': 'portfolio', idx_names[-1]: 'name'}).drop(columns=idx_names[1:-1])
652
653 return df
654
Since the other answer uses iterrows
, which usually isn't advised, figured a quick time compare was worthwhile.
1{
2 "meta": {
3 "columns": [{
4 "key": "value",
5 "display_name": "Adjusted Value (No Div, USD)",
6 "output_type": "Number",
7 "currency": "USD"
8 },
9 {
10 "key": "time_weighted_return",
11 "display_name": "Current Quarter TWR (USD)",
12 "output_type": "Percent",
13 "currency": "USD"
14 },
15 {
16 "key": "time_weighted_return_2",
17 "display_name": "YTD TWR (USD)",
18 "output_type": "Percent",
19 "currency": "USD"
20 },
21 {
22 "key": "_custom_twr_audit_note_911328",
23 "display_name": "TWR Audit Note",
24 "output_type": "Word"
25 }
26 ],
27 "groupings": [{
28 "key": "_custom_name_747205",
29 "display_name": "* Reporting Client Name"
30 },
31 {
32 "key": "_custom_new_entity_group_453577",
33 "display_name": "NEW Entity Group"
34 },
35 {
36 "key": "_custom_level_2_624287",
37 "display_name": "* Level 2"
38 },
39 {
40 "key": "legal_entity",
41 "display_name": "Legal Entity"
42 }
43 ]
44 },
45 "data": {
46 "type": "portfolio_views",
47 "attributes": {
48 "total": {
49 "name": "Total",
50 "columns": {
51 "time_weighted_return": -0.046732301295604683,
52 "time_weighted_return_2": -0.046732301295604683,
53 "_custom_twr_audit_note_911328": null,
54 "value": 23132492.905107163
55 },
56 "children": [{
57 "name": "Falconer Family",
58 "grouping": "_custom_name_747205",
59 "columns": {
60 "time_weighted_return": -0.046732301295604683,
61 "time_weighted_return_2": -0.046732301295604683,
62 "_custom_twr_audit_note_911328": null,
63 "value": 23132492.905107163
64 },
65 "children": [{
66 "name": "Wealth Bucket A",
67 "grouping": "_custom_new_entity_group_453577",
68 "columns": {
69 "time_weighted_return": -0.045960317420568164,
70 "time_weighted_return_2": -0.045960317420568164,
71 "_custom_twr_audit_note_911328": null,
72 "value": 13264448.506587159
73 },
74 "children": [{
75 "name": "Asset Class A",
76 "grouping": "_custom_level_2_624287",
77 "columns": {
78 "time_weighted_return": 0.000003434094574039648,
79 "time_weighted_return_2": 0.000003434094574039648,
80 "_custom_twr_audit_note_911328": null,
81 "value": 3337.99
82 },
83 "children": [{
84 "entity_id": 10604454,
85 "name": "HUDJ Trust",
86 "grouping": "legal_entity",
87 "columns": {
88 "time_weighted_return": 0.000003434094574039648,
89 "time_weighted_return_2": 0.000003434094574039648,
90 "_custom_twr_audit_note_911328": null,
91 "value": 3337.99
92 },
93 "children": []
94 }]
95 },
96 {
97 "name": "Asset Class B",
98 "grouping": "_custom_level_2_624287",
99 "columns": {
100 "time_weighted_return": -0.025871339096964152,
101 "time_weighted_return_2": -0.025871339096964152,
102 "_custom_twr_audit_note_911328": null,
103 "value": 1017004.7192636987
104 },
105 "children": [{
106 "entity_id": 10604454,
107 "name": "HUDG Trust",
108 "grouping": "legal_entity",
109 "columns": {
110 "time_weighted_return": -0.025871339096964152,
111 "time_weighted_return_2": -0.025871339096964152,
112 "_custom_twr_audit_note_911328": null,
113 "value": 1017004.7192636987
114 },
115 "children": []
116 }]
117 },
118 {
119 "name": "Asset Class C",
120 "grouping": "_custom_level_2_624287",
121 "columns": {
122 "time_weighted_return": -0.030370376329670656,
123 "time_weighted_return_2": -0.030370376329670656,
124 "_custom_twr_audit_note_911328": null,
125 "value": 231142.67772000004
126 },
127 "children": [{
128 "entity_id": 10604454,
129 "name": "HKDJ Trust",
130 "grouping": "legal_entity",
131 "columns": {
132 "time_weighted_return": -0.030370376329670656,
133 "time_weighted_return_2": -0.030370376329670656,
134 "_custom_twr_audit_note_911328": null,
135 "value": 231142.67772000004
136 },
137 "children": []
138 }]
139 },
140 {
141 "name": "Asset Class D",
142 "grouping": "_custom_level_2_624287",
143 "columns": {
144 "time_weighted_return": -0.05382756475465478,
145 "time_weighted_return_2": -0.05382756475465478,
146 "_custom_twr_audit_note_911328": null,
147 "value": 9791282.570000006
148 },
149 "children": [{
150 "entity_id": 10604454,
151 "name": "HUDW Trust",
152 "grouping": "legal_entity",
153 "columns": {
154 "time_weighted_return": -0.05382756475465478,
155 "time_weighted_return_2": -0.05382756475465478,
156 "_custom_twr_audit_note_911328": null,
157 "value": 9791282.570000006
158 },
159 "children": []
160 }]
161 },
162 {
163 "name": "Asset Class E",
164 "grouping": "_custom_level_2_624287",
165 "columns": {
166 "time_weighted_return": -0.01351630404081805,
167 "time_weighted_return_2": -0.01351630404081805,
168 "_custom_twr_audit_note_911328": null,
169 "value": 2153366.6396034593
170 },
171 "children": [{
172 "entity_id": 10604454,
173 "name": "HJDJ Trust",
174 "grouping": "legal_entity",
175 "columns": {
176 "time_weighted_return": -0.01351630404081805,
177 "time_weighted_return_2": -0.01351630404081805,
178 "_custom_twr_audit_note_911328": null,
179 "value": 2153366.6396034593
180 },
181 "children": []
182 }]
183 },
184 {
185 "name": "Asset Class F",
186 "grouping": "_custom_level_2_624287",
187 "columns": {
188 "time_weighted_return": -0.002298190175237247,
189 "time_weighted_return_2": -0.002298190175237247,
190 "_custom_twr_audit_note_911328": null,
191 "value": 68313.90999999999
192 },
193 "children": [{
194 "entity_id": 10604454,
195 "name": "HADJ Trust",
196 "grouping": "legal_entity",
197 "columns": {
198 "time_weighted_return": -0.002298190175237247,
199 "time_weighted_return_2": -0.002298190175237247,
200 "_custom_twr_audit_note_911328": null,
201 "value": 68313.90999999999
202 },
203 "children": []
204 }]
205 }
206 ]
207 },
208 {
209 "name": "Wealth Bucket B",
210 "grouping": "_custom_new_entity_group_453577",
211 "columns": {
212 "time_weighted_return": -0.04769870075659244,
213 "time_weighted_return_2": -0.04769870075659244,
214 "_custom_twr_audit_note_911328": null,
215 "value": 9868044.398519998
216 },
217 "children": [{
218 "name": "Asset Class A",
219 "grouping": "_custom_level_2_624287",
220 "columns": {
221 "time_weighted_return": 0.000028632718065191298,
222 "time_weighted_return_2": 0.000028632718065191298,
223 "_custom_twr_audit_note_911328": null,
224 "value": 10234.94
225 },
226 "children": [{
227 "entity_id": 10868778,
228 "name": "2012 Desc Tr HBO Thalia",
229 "grouping": "legal_entity",
230 "columns": {
231 "time_weighted_return": 0.0000282679297198829,
232 "time_weighted_return_2": 0.0000282679297198829,
233 "_custom_twr_audit_note_911328": null,
234 "value": 244.28
235 },
236 "children": []
237 },
238 {
239 "entity_id": 10643052,
240 "name": "2013 Irrev Tr HBO Thalia",
241 "grouping": "legal_entity",
242 "columns": {
243 "time_weighted_return": 0.000049373572795108345,
244 "time_weighted_return_2": 0.000049373572795108345,
245 "_custom_twr_audit_note_911328": null,
246 "value": 5081.08
247 },
248 "children": []
249 },
250 {
251 "entity_id": 10598341,
252 "name": "Cht 11th Tr HBO Shirley",
253 "grouping": "legal_entity",
254 "columns": {
255 "time_weighted_return": 0.000006609603754315074,
256 "time_weighted_return_2": 0.000006609603754315074,
257 "_custom_twr_audit_note_911328": null,
258 "value": 1523.62
259 },
260 "children": []
261 },
262 {
263 "entity_id": 10598337,
264 "name": "Cht 11th Tr HBO Hannah",
265 "grouping": "legal_entity",
266 "columns": {
267 "time_weighted_return": 0.000010999769004760296,
268 "time_weighted_return_2": 0.000010999769004760296,
269 "_custom_twr_audit_note_911328": null,
270 "value": 1828.9
271 },
272 "children": []
273 },
274 {
275 "entity_id": 10598334,
276 "name": "Cht 11th Tr HBO Lau",
277 "grouping": "legal_entity",
278 "columns": {
279 "time_weighted_return": 0.000006466673995619843,
280 "time_weighted_return_2": 0.000006466673995619843,
281 "_custom_twr_audit_note_911328": null,
282 "value": 1557.06
283 },
284 "children": []
285 }
286 ]
287 },
288 {
289 "name": "Asset Class B",
290 "grouping": "_custom_level_2_624287",
291 "columns": {
292 "time_weighted_return": -0.024645947842438676,
293 "time_weighted_return_2": -0.024645947842438676,
294 "_custom_twr_audit_note_911328": null,
295 "value": 674052.31962
296 },
297 "children": [{
298 "entity_id": 10868778,
299 "name": "2012 Desc Tr HBO Thalia",
300 "grouping": "legal_entity",
301 "columns": {
302 "time_weighted_return": -0.043304004172576405,
303 "time_weighted_return_2": -0.043304004172576405,
304 "_custom_twr_audit_note_911328": null,
305 "value": 52800.96
306 },
307 "children": []
308 },
309 {
310 "entity_id": 10643052,
311 "name": "2013 Irrev Tr HBO Thalia",
312 "grouping": "legal_entity",
313 "columns": {
314 "time_weighted_return": -0.022408434778798836,
315 "time_weighted_return_2": -0.022408434778798836,
316 "_custom_twr_audit_note_911328": null,
317 "value": 599594.11962
318 },
319 "children": []
320 },
321 {
322 "entity_id": 10598341,
323 "name": "Cht 11th Tr HBO Shirley",
324 "grouping": "legal_entity",
325 "columns": {
326 "time_weighted_return": -0.039799855483646174,
327 "time_weighted_return_2": -0.039799855483646174,
328 "_custom_twr_audit_note_911328": null,
329 "value": 7219.08
330 },
331 "children": []
332 },
333 {
334 "entity_id": 10598337,
335 "name": "Cht 11th Tr HBO Hannah",
336 "grouping": "legal_entity",
337 "columns": {
338 "time_weighted_return": -0.039799855483646174,
339 "time_weighted_return_2": -0.039799855483646174,
340 "_custom_twr_audit_note_911328": null,
341 "value": 7219.08
342 },
343 "children": []
344 },
345 {
346 "entity_id": 10598334,
347 "name": "Cht 11th Tr HBO Lau",
348 "grouping": "legal_entity",
349 "columns": {
350 "time_weighted_return": -0.039799855483646174,
351 "time_weighted_return_2": -0.039799855483646174,
352 "_custom_twr_audit_note_911328": null,
353 "value": 7219.08
354 },
355 "children": []
356 }
357 ]
358 },
359 {
360 "name": "Asset Class C",
361 "grouping": "_custom_level_2_624287",
362 "columns": {
363 "time_weighted_return": -0.03037038746301135,
364 "time_weighted_return_2": -0.03037038746301135,
365 "_custom_twr_audit_note_911328": null,
366 "value": 114472.69744
367 },
368 "children": [{
369 "entity_id": 10868778,
370 "name": "2012 Desc Tr HBO Thalia",
371 "grouping": "legal_entity",
372 "columns": {
373 "time_weighted_return": -0.030370390035505124,
374 "time_weighted_return_2": -0.030370390035505124,
375 "_custom_twr_audit_note_911328": null,
376 "value": 114472.68744000001
377 },
378 "children": []
379 },
380 {
381 "entity_id": 10643052,
382 "name": "2013 Irrev Tr HBO Thalia",
383 "grouping": "legal_entity",
384 "columns": {
385 "time_weighted_return": 0,
386 "time_weighted_return_2": 0,
387 "_custom_twr_audit_note_911328": null,
388 "value": 0.01
389 },
390 "children": []
391 }
392 ]
393 },
394 {
395 "name": "Asset Class D",
396 "grouping": "_custom_level_2_624287",
397 "columns": {
398 "time_weighted_return": -0.06604362523792162,
399 "time_weighted_return_2": -0.06604362523792162,
400 "_custom_twr_audit_note_911328": null,
401 "value": 5722529.229999997
402 },
403 "children": [{
404 "entity_id": 10868778,
405 "name": "2012 Desc Tr HBO Thalia",
406 "grouping": "legal_entity",
407 "columns": {
408 "time_weighted_return": -0.06154960593668424,
409 "time_weighted_return_2": -0.06154960593668424,
410 "_custom_twr_audit_note_911328": null,
411 "value": 1191838.9399999995
412 },
413 "children": []
414 },
415 {
416 "entity_id": 10643052,
417 "name": "2013 Irrev Tr HBO Thalia",
418 "grouping": "legal_entity",
419 "columns": {
420 "time_weighted_return": -0.06750460387418267,
421 "time_weighted_return_2": -0.06750460387418267,
422 "_custom_twr_audit_note_911328": null,
423 "value": 4416618.520000002
424 },
425 "children": []
426 },
427 {
428 "entity_id": 10598341,
429 "name": "Cht 11th Tr HBO Shirley",
430 "grouping": "legal_entity",
431 "columns": {
432 "time_weighted_return": -0.05604507809250081,
433 "time_weighted_return_2": -0.05604507809250081,
434 "_custom_twr_audit_note_911328": null,
435 "value": 38190.33
436 },
437 "children": []
438 },
439 {
440 "entity_id": 10598337,
441 "name": "Cht 11th Tr HBO Hannah",
442 "grouping": "legal_entity",
443 "columns": {
444 "time_weighted_return": -0.05604507809250081,
445 "time_weighted_return_2": -0.05604507809250081,
446 "_custom_twr_audit_note_911328": null,
447 "value": 37940.72
448 },
449 "children": []
450 },
451 {
452 "entity_id": 10598334,
453 "name": "Cht 11th Tr HBO Lau",
454 "grouping": "legal_entity",
455 "columns": {
456 "time_weighted_return": -0.05604507809250081,
457 "time_weighted_return_2": -0.05604507809250081,
458 "_custom_twr_audit_note_911328": null,
459 "value": 37940.72
460 },
461 "children": []
462 }
463 ]
464 },
465 {
466 "name": "Asset Class E",
467 "grouping": "_custom_level_2_624287",
468 "columns": {
469 "time_weighted_return": -0.017118805423322003,
470 "time_weighted_return_2": -0.017118805423322003,
471 "_custom_twr_audit_note_911328": null,
472 "value": 3148495.0914600003
473 },
474 "children": [{
475 "entity_id": 10868778,
476 "name": "2012 Desc Tr HBO Thalia",
477 "grouping": "legal_entity",
478 "columns": {
479 "time_weighted_return": -0.015251157805867277,
480 "time_weighted_return_2": -0.015251157805867277,
481 "_custom_twr_audit_note_911328": null,
482 "value": 800493.06146
483 },
484 "children": []
485 },
486 {
487 "entity_id": 10643052,
488 "name": "2013 Irrev Tr HBO Thalia",
489 "grouping": "legal_entity",
490 "columns": {
491 "time_weighted_return": -0.01739609576880241,
492 "time_weighted_return_2": -0.01739609576880241,
493 "_custom_twr_audit_note_911328": null,
494 "value": 2215511.2700000005
495 },
496 "children": []
497 },
498 {
499 "entity_id": 10598341,
500 "name": "Cht 11th Tr HBO Shirley",
501 "grouping": "legal_entity",
502 "columns": {
503 "time_weighted_return": -0.02085132265594647,
504 "time_weighted_return_2": -0.02085132265594647,
505 "_custom_twr_audit_note_911328": null,
506 "value": 44031.21
507 },
508 "children": []
509 },
510 {
511 "entity_id": 10598337,
512 "name": "Cht 11th Tr HBO Hannah",
513 "grouping": "legal_entity",
514 "columns": {
515 "time_weighted_return": -0.02089393244695803,
516 "time_weighted_return_2": -0.02089393244695803,
517 "_custom_twr_audit_note_911328": null,
518 "value": 44394.159999999996
519 },
520 "children": []
521 },
522 {
523 "entity_id": 10598334,
524 "name": "Cht 11th Tr HBO Lau",
525 "grouping": "legal_entity",
526 "columns": {
527 "time_weighted_return": -0.020607507059866248,
528 "time_weighted_return_2": -0.020607507059866248,
529 "_custom_twr_audit_note_911328": null,
530 "value": 44065.39000000001
531 },
532 "children": []
533 }
534 ]
535 },
536 {
537 "name": "Asset Class F",
538 "grouping": "_custom_level_2_624287",
539 "columns": {
540 "time_weighted_return": -0.0014710489231547497,
541 "time_weighted_return_2": -0.0014710489231547497,
542 "_custom_twr_audit_note_911328": null,
543 "value": 198260.12
544 },
545 "children": [{
546 "entity_id": 10868778,
547 "name": "2012 Desc Tr HBO Thalia",
548 "grouping": "legal_entity",
549 "columns": {
550 "time_weighted_return": -0.0014477244560456848,
551 "time_weighted_return_2": -0.0014477244560456848,
552 "_custom_twr_audit_note_911328": null,
553 "value": 44612.33
554 },
555 "children": []
556 },
557 {
558 "entity_id": 10643052,
559 "name": "2013 Irrev Tr HBO Thalia",
560 "grouping": "legal_entity",
561 "columns": {
562 "time_weighted_return": -0.001477821083437858,
563 "time_weighted_return_2": -0.001477821083437858,
564 "_custom_twr_audit_note_911328": null,
565 "value": 153647.78999999998
566 },
567 "children": []
568 }
569 ]
570 }
571 ]
572 }
573 ]
574 }]
575 }
576 },
577 "included": []
578 }
579}
580# Function to read API response / JSON Object
581def response_writer():
582 with open('api_response_2022-02-13.json') as f:
583 api_response = json.load(f)
584 return api_response
585
586# Function to unpack JSON response into pandas dataframe.
587def unpack_response():
588 while True:
589 try:
590 api_response = response_writer()
591 portfolio_views_children = api_response['data']['attributes']['total']['children']
592 portfolios = []
593 for portfolio in portfolio_views_children:
594 entity_columns = []
595 # include portfolio itself within an iterable so the total is the header
596 for entity in itertools.chain([portfolio], portfolio["children"]):
597 entity_data = entity["columns"].copy() # don't mutate original response
598 entity_data["portfolio"] = portfolio["name"] # from outer
599 entity_data["name"] = entity["name"]
600 entity_columns.append(entity_data)
601
602 df = pd.DataFrame(entity_columns)
603 portfolios.append(df)
604
605 # combine dataframes
606 df = pd.concat(portfolios)
607 # reorder and rename
608 column_ordering = {"portfolio": "portfolio", "name": "name"}
609 column_ordering.update({c["key"]: c["display_name"] for c in api_response["meta"]["columns"]})
610 df = df[column_ordering.keys()] # beware: un-named cols will be dropped
611 df = df.rename(columns=column_ordering)
612 break
613 except KeyError:
614 print("-----------------------------------\n","API TIMEOUT ERROR: TRY AGAIN...", "\n-----------------------------------\n")
615 return df
616unpack_response()
617def process_json(api_response):
618
619 def get_column_values(df):
620 return pd.concat([df, pd.json_normalize(df.pop('columns')).set_axis(df.index)], axis=1)
621
622 def expand_children(df):
623 if len(df.index) > 1:
624 df['children'] = df['children'].fillna('').apply(lambda x: None if len(x) == 0 else x)
625 df_children = df.pop('children').dropna().explode()
626 if len(df_children.index) == 0: # return df if no children to append
627 return df.index.names, df
628 df_children = pd.json_normalize(df_children, max_level=0).set_axis(df_children.index).set_index('name', append=True)
629 df_children = get_column_values(df_children)
630 idx_names = list(df_children.index.names)
631 idx_names[-1] = idx_names[-1] + '_' + str(len(idx_names))
632 df[idx_names[-1]] = None
633 return idx_names, pd.concat([df.set_index(idx_names[-1], append=True), df_children], axis=0)
634
635 columns_dict = pd.DataFrame(api_response['meta']['columns']).set_index('key').to_dict(orient='index') # save column definitions
636 df = pd.DataFrame(api_response['data']['attributes']['total']['children']).set_index('name') # get initial dataframe
637 df = get_column_values(df) # get columns for initial level
638
639 # expand children
640 while 'children' in df.columns:
641 idx_names, df = expand_children(df)
642
643 # reorder/replace column headers and sort index
644 df = (df.loc[:, [x for x in df.columns if x not in columns_dict.keys()] + list(columns_dict.keys())]
645 .rename(columns={k:v['display_name'] for k,v in columns_dict.items()})
646 .sort_index(na_position='first').reset_index())
647
648 #collapse "name" columns (careful of potential duplicate rows)
649 for col in idx_names[::-1]:
650 df[idx_names[-1]] = df[idx_names[-1]].fillna(df[col])
651 df = df.rename(columns={'name': 'portfolio', idx_names[-1]: 'name'}).drop(columns=idx_names[1:-1])
652
653 return df
654process_json(api_response)
65554.2 ms ± 7.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
656
657unpack_response(api_response) # iterrows
65884.3 ms ± 9.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
659
QUESTION
AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks'>
Asked 2022-Feb-25 at 13:18I was using pyspark on AWS EMR (4 r5.xlarge as 4 workers, each has one executor and 4 cores), and I got AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks'
. Below is a snippet of the code that threw this error:
1search = SearchEngine(db_file_dir = "/tmp/db")
2conn = sqlite3.connect("/tmp/db/simple_db.sqlite")
3pdf_ = pd.read_sql_query('''select zipcode, lat, lng,
4 bounds_west, bounds_east, bounds_north, bounds_south from
5 simple_zipcode''',conn)
6brd_pdf = spark.sparkContext.broadcast(pdf_)
7conn.close()
8
9
10@udf('string')
11def get_zip_b(lat, lng):
12 pdf = brd_pdf.value
13 out = pdf[(np.array(pdf["bounds_north"]) >= lat) &
14 (np.array(pdf["bounds_south"]) <= lat) &
15 (np.array(pdf['bounds_west']) <= lng) &
16 (np.array(pdf['bounds_east']) >= lng) ]
17 if len(out):
18 min_index = np.argmin( (np.array(out["lat"]) - lat)**2 + (np.array(out["lng"]) - lng)**2)
19 zip_ = str(out["zipcode"].iloc[min_index])
20 else:
21 zip_ = 'bad'
22 return zip_
23
24df = df.withColumn('zipcode', get_zip_b(col("latitude"),col("longitude")))
25
Below is the traceback, where line 102, in get_zip_b refers to pdf = brd_pdf.value
:
1search = SearchEngine(db_file_dir = "/tmp/db")
2conn = sqlite3.connect("/tmp/db/simple_db.sqlite")
3pdf_ = pd.read_sql_query('''select zipcode, lat, lng,
4 bounds_west, bounds_east, bounds_north, bounds_south from
5 simple_zipcode''',conn)
6brd_pdf = spark.sparkContext.broadcast(pdf_)
7conn.close()
8
9
10@udf('string')
11def get_zip_b(lat, lng):
12 pdf = brd_pdf.value
13 out = pdf[(np.array(pdf["bounds_north"]) >= lat) &
14 (np.array(pdf["bounds_south"]) <= lat) &
15 (np.array(pdf['bounds_west']) <= lng) &
16 (np.array(pdf['bounds_east']) >= lng) ]
17 if len(out):
18 min_index = np.argmin( (np.array(out["lat"]) - lat)**2 + (np.array(out["lng"]) - lng)**2)
19 zip_ = str(out["zipcode"].iloc[min_index])
20 else:
21 zip_ = 'bad'
22 return zip_
23
24df = df.withColumn('zipcode', get_zip_b(col("latitude"),col("longitude")))
2521/08/02 06:18:19 WARN TaskSetManager: Lost task 12.0 in stage 7.0 (TID 1814, ip-10-22-17-94.pclc0.merkle.local, executor 6): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
26 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/worker.py", line 605, in main
27 process()
28 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/worker.py", line 597, in process
29 serializer.dump_stream(out_iter, outfile)
30 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/serializers.py", line 223, in dump_stream
31 self.serializer.dump_stream(self._batched(iterator), stream)
32 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/serializers.py", line 141, in dump_stream
33 for obj in iterator:
34 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/serializers.py", line 212, in _batched
35 for item in iterator:
36 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/worker.py", line 450, in mapper
37 result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
38 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/worker.py", line 450, in <genexpr>
39 result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
40 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/worker.py", line 90, in <lambda>
41 return lambda *a: f(*a)
42 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/util.py", line 121, in wrapper
43 return f(*args, **kwargs)
44 File "/mnt/var/lib/hadoop/steps/s-1IBFS0SYWA19Z/Mobile_ID_process_center.py", line 102, in get_zip_b
45 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/broadcast.py", line 146, in value
46 self._value = self.load_from_path(self._path)
47 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/broadcast.py", line 123, in load_from_path
48 return self.load(f)
49 File "/mnt/yarn/usercache/hadoop/appcache/application_1627867699893_0001/container_1627867699893_0001_01_000009/pyspark.zip/pyspark/broadcast.py", line 129, in load
50 return pickle.load(file)
51AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks' from '/mnt/miniconda/lib/python3.9/site-packages/pandas/core/internals/blocks.py'>
52
Some observations and thought process:
1, After doing some search online, the AttributeError in pyspark seems to be caused by mismatched pandas versions between driver and workers?
2, But I ran the same code on two different datasets, one worked without any errors but the other didn't, which seems very strange and undeterministic, and it seems like the errors may not be caused by mismatched pandas versions. Otherwise, neither two datasets would succeed.
3, I then ran the same code on the successful dataset again, but this time with different spark configurations: setting spark.driver.memory from 2048M to 4192m, and it threw AttributeError.
4, In conclusion, I think the AttributeError has something to do with driver. But I can't tell how they are related from the error message, and how to fix it: AttributeError: Can't get attribute 'new_block' on <module 'pandas.core.internals.blocks'.
ANSWER
Answered 2021-Aug-26 at 14:53I had the same error using pandas 1.3.2 in the server while 1.2 in my client. Downgrading pandas to 1.2 solved the problem.
QUESTION
How to update pandas DataFrame.drop() for Future Warning - all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
Asked 2022-Feb-13 at 19:56The following code:
1df = df.drop('market', 1)
2
generates the warning:
FutureWarning: In a future version of pandas all arguments of DataFrame.drop except for the argument 'labels' will be keyword-only
market
is the column we want to drop, and we pass the 1
as a second parameter for axis (0 for index, 1 for columns, so we pass 1).
How can we change this line of code now so that it is not a problem in the future version of pandas / to resolve the warning message now?
ANSWER
Answered 2022-Feb-13 at 19:56From the documentation, pandas.DataFrame.drop
has the following parameters:
Parameters
labels: single label or list-like Index or column labels to drop.
axis: {0 or ‘index’, 1 or ‘columns’}, default 0 Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
index: single label or list-like Alternative to specifying axis (labels, axis=0 is equivalent to index=labels).
columns: single label or list-like Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
level: int or level name, optional For MultiIndex, level from which the labels will be removed.
inplace: bool, default False If False, return a copy. Otherwise, do operation inplace and return None.
errors: {‘ignore’, ‘raise’}, default ‘raise’ If ‘ignore’, suppress error and only existing labels are dropped.
Moving forward, only labels
(the first parameter) can be positional.
So, for this example, the drop
code should be as follows:
1df = df.drop('market', 1)
2df = df.drop('market', axis=1)
3
or (more legibly) with columns
:
1df = df.drop('market', 1)
2df = df.drop('market', axis=1)
3df = df.drop(columns='market')
4
QUESTION
Cannot set up a conda environment with python 3.10
Asked 2022-Jan-31 at 10:35I am trying to set up a conda environment with python 3.10 installed. For some reason, no install commands for additional packages are working. For example, if I run conda install pandas
, I get the error:
1PackagesNotFoundError: The following packages are not available from current channels:
2
3 - python=3.1
4
conda install -c conda-forge pandas
doesn't work either. Not sure what the problem is.
ANSWER
Answered 2021-Oct-08 at 08:42Thats a bug in conda, you can read more about it here: https://github.com/conda/conda/issues/10969
Right now there is a PR to fix it but its not a released version. For now, just stick with
1PackagesNotFoundError: The following packages are not available from current channels:
2
3 - python=3.1
4conda install python=3.9
5
QUESTION
ImportError: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic'
Asked 2022-Jan-12 at 23:01I have this output :
[Pandas-profiling] ImportError: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic'
when trying to import pandas-profiling in this fashion :
1from pandas_profiling import ProfileReport
2
It seems to import pandas-profiling correctly but struggles when it comes to interfacing with pandas itself. Both libraries are currently up to date through conda. It doesn't seem to match any of the common problems associated with pandas-profiling as per their documentation, and I can't seem to locate a more general solution of importing the name ABCIndexClass
.
Thanks
ANSWER
Answered 2021-Aug-09 at 19:19Pandas v1.3 renamed the ABCIndexClass
to ABCIndex
. The visions
dependency of the pandas-profiling
package hasn't caught up yet, and so throws an error when it can't find ABCIndexClass
. Downgrading pandas to the 1.2.x series will resolve the issue.
Alternatively, you can just wait for the visions
package to be updated.
QUESTION
Merge two pandas DataFrame based on partial match
Asked 2022-Jan-06 at 00:54Two DataFrames have city names that are not formatted the same way. I'd like to do a Left-outer join and pull geo
field for all partial string matches between the field City
in both DataFrames.
1import pandas as pd
2
3df1 = pd.DataFrame({
4 'City': ['San Francisco, CA','Oakland, CA'],
5 'Val': [1,2]
6 })
7
8df2 = pd.DataFrame({
9 'City': ['San Francisco-Oakland, CA','Salinas, CA'],
10 'Geo': ['geo1','geo2']
11 })
12
Expected DataFrame
upon join:
1import pandas as pd
2
3df1 = pd.DataFrame({
4 'City': ['San Francisco, CA','Oakland, CA'],
5 'Val': [1,2]
6 })
7
8df2 = pd.DataFrame({
9 'City': ['San Francisco-Oakland, CA','Salinas, CA'],
10 'Geo': ['geo1','geo2']
11 })
12 City Val Geo
13
14 San Francisco, CA 1 geo1
15 Oakland, CA 2 geo1
16
ANSWER
Answered 2021-Sep-12 at 20:24This should do the job. String match with Levenshtein_distance.
pip install thefuzz[speedup]
1import pandas as pd
2
3df1 = pd.DataFrame({
4 'City': ['San Francisco, CA','Oakland, CA'],
5 'Val': [1,2]
6 })
7
8df2 = pd.DataFrame({
9 'City': ['San Francisco-Oakland, CA','Salinas, CA'],
10 'Geo': ['geo1','geo2']
11 })
12 City Val Geo
13
14 San Francisco, CA 1 geo1
15 Oakland, CA 2 geo1
16import pandas as pd
17import numpy as np
18
19from thefuzz import process
20
21def fuzzy_match(
22 a: pd.DataFrame, b: pd.DataFrame, col: str, limit: int = 5, thresh: int = 80
23):
24 """use fuzzy matching to join on column"""
25
26 s = b[col].tolist()
27
28 matches = a[col].apply(lambda x: process.extract(x, s, limit=limit))
29 matches = pd.DataFrame(np.concatenate(matches), columns=["match", "score"])
30
31 # join other columns in b to matches
32 to_join = (
33 pd.merge(left=b, right=matches, how="right", left_on="City", right_on="match")
34 .set_index( # create an index that represents the matching row in df a, you can drop this when `limit=1`
35 np.array(
36 list(
37 np.repeat(i, limit if limit < len(b) else len(b))
38 for i in range(len(a))
39 )
40 ).flatten()
41 )
42 .drop(columns=["match"])
43 .astype({"score": "int16"})
44 )
45 print(f"\t the index here represents the row in dataframe a on which to join")
46 print(to_join)
47
48 res = pd.merge(
49 left=a, right=to_join, left_index=True, right_index=True, suffixes=("", "_b")
50 )
51
52 # return only the highest match or you can just set the limit to 1
53 # and remove this
54 df = res.reset_index()
55 df = df.iloc[df.groupby(by="index")["score"].idxmax()].reset_index(drop=True)
56
57 return df.drop(columns=["City_b", "score", "index"])
58
59def test(df):
60
61 expected = pd.DataFrame(
62 {
63 "City": ["San Francisco, CA", "Oakland, CA"],
64 "Val": [1, 2],
65 "Geo": ["geo1", "geo1"],
66 }
67 )
68
69 print(f'{"expected":-^70}')
70 print(expected)
71
72 print(f'{"res":-^70}')
73 print(df)
74
75 assert expected.equals(df)
76
77
78if __name__ == "__main__":
79
80 a = pd.DataFrame({"City": ["San Francisco, CA", "Oakland, CA"], "Val": [1, 2]})
81 b = pd.DataFrame(
82 {"City": ["San Francisco-Oakland, CA", "Salinas, CA"], "Geo": ["geo1", "geo2"]}
83 )
84
85 print(f'\n\n{"fuzzy match":-^70}')
86 res = fuzzy_match(a, b, col="City")
87 test(res)
88
89
QUESTION
Create a new column in a Pandas DataFrame from existing column names
Asked 2021-Nov-15 at 00:22I want to deconstruct a pandas DataFrame, using column headers as a new data-column and create a list with all combinations of the row index and columns. Easier to show than explain:
1index_col = ["store1", "store2", "store3"]
2cols = ["January", "February", "March"]
3values = [[2,3,4],[5,6,7],[8,9,10]]
4df = pd.DataFrame(values, index=index_col, columns=cols)
5
From this DataFrame I wish to get the following list:
1index_col = ["store1", "store2", "store3"]
2cols = ["January", "February", "March"]
3values = [[2,3,4],[5,6,7],[8,9,10]]
4df = pd.DataFrame(values, index=index_col, columns=cols)
5[['store1', 'January', 2],
6 ['store1', 'February', 3],
7 ['store1', 'March', 4],
8 ['store2', 'January', 5],
9 ['store2', 'February', 6],
10 ['store2', 'March', 7],
11 ['store3', 'January', 8],
12 ['store3', 'February', 9],
13 ['store3', 'March', 10]]
14
Is there a convenient way to do this?
ANSWER
Answered 2021-Nov-09 at 23:58The structure that you want your data in is very messy, so this is probably the best method given the data you want.
1index_col = ["store1", "store2", "store3"]
2cols = ["January", "February", "March"]
3values = [[2,3,4],[5,6,7],[8,9,10]]
4df = pd.DataFrame(values, index=index_col, columns=cols)
5[['store1', 'January', 2],
6 ['store1', 'February', 3],
7 ['store1', 'March', 4],
8 ['store2', 'January', 5],
9 ['store2', 'February', 6],
10 ['store2', 'March', 7],
11 ['store3', 'January', 8],
12 ['store3', 'February', 9],
13 ['store3', 'March', 10]]
14# Results
15res = []
16
17# Nested loop: first for length of index col, then next for cols
18for i in range(len(index_col)):
19 for j in range(len(cols)):
20 # Format of data
21 res.append([index_col[i], cols[j], values[i][j]])
22
23# Return results
24print(res)
25return res
26
QUESTION
After conda update, python kernel crashes when matplotlib is used
Asked 2021-Nov-06 at 19:03I have create this simple env with conda
:
1conda create -n test python=3.8.5 pandas scipy numpy matplotlib seaborn jupyterlab
2
The following code in jupyter lab
crashes the kernel :
1conda create -n test python=3.8.5 pandas scipy numpy matplotlib seaborn jupyterlab
2import matplotlib.pyplot as plt
3
4plt.subplot()
5
I don't face the problem on Linux
. The problem is when I try on Windows 10
.
There are no errors on the jupyter lab
console (where I started the server), and I have no idea where to investigate.
ANSWER
Answered 2021-Nov-06 at 19:03- The default
pkgs/main
channel forconda
has reverted to usingfreetype 2.10.4
for Windows, per main / packages / freetype. - If you are still experiencing the issue, use
conda list freetype
to check the version:freetype != 2.11.0
- If it is 2.11.0, then change the version per the solution, or
conda update --all
(providing your default channel isn't changed in the.condarc
config file).
- If it is 2.11.0, then change the version per the solution, or
- If this is occurring after installing Anaconda, updating
conda
orfreetype
since Oct 27, 2021. - Go to the
Anaconda
prompt and downgradefreetype 2.11.0
in any affected environment.conda install freetype=2.10.4
- Relevant to any package using
matplotlib
and any IDE- For example,
pandas.DataFrame.plot
andseaborn
- Jupyter, Spyder, VSCode, PyCharm, command line.
- For example,
Discovery
- An issue occurs after updating with the most current updates from
conda
, released Friday, Oct 29. - After updating with
conda update --all
, there's an issue with anything related tomatplotlib
in any IDE (not justJupyter
).- I tested this in
JupyterLab
,PyCharm
, andpython
from the command prompt. - PyCharm:
Process finished with exit code -1073741819
- JupyterLab: kernel just restarts and there are no associated errors or Traceback
- command prompt: a blank interactive matplotlib window will appear briefly, and then a new command line appears.
- I tested this in
- The issue seems to be with
conda update --all
in(base)
, then any plot API that usesmatplotlib
(e.g.seaborn
andpandas.DataFrame.plot
) kills the kernel in any environment. - I had to reinstall Anaconda, but do not do an update of
(base)
, then my other environments worked. - I have not figured out what specifically is causing the issue.
- I tested the issue with
python 3.8.12
andpython 3.9.7
- Current Testing:
- Following is the
conda
revision log. - Prior to
conda update --all
this environment was working, but after the updates, plotting withmatplotlib
crashes the python kernel
- Following is the
1conda create -n test python=3.8.5 pandas scipy numpy matplotlib seaborn jupyterlab
2import matplotlib.pyplot as plt
3
4plt.subplot()
5 2021-10-31 10:47:22 (rev 3)
6 bokeh {2.3.3 (defaults/win-64) -> 2.4.1 (defaults/win-64)}
7 click {8.0.1 (defaults/noarch) -> 8.0.3 (defaults/noarch)}
8 filelock {3.0.12 (defaults/noarch) -> 3.3.1 (defaults/noarch)}
9 freetype {2.10.4 (defaults/win-64) -> 2.11.0 (defaults/win-64)}
10 imagecodecs {2021.6.8 (defaults/win-64) -> 2021.8.26 (defaults/win-64)}
11 joblib {1.0.1 (defaults/noarch) -> 1.1.0 (defaults/noarch)}
12 lerc {2.2.1 (defaults/win-64) -> 3.0 (defaults/win-64)}
13 more-itertools {8.8.0 (defaults/noarch) -> 8.10.0 (defaults/noarch)}
14 pyopenssl {20.0.1 (defaults/noarch) -> 21.0.0 (defaults/noarch)}
15 scikit-learn {0.24.2 (defaults/win-64) -> 1.0.1 (defaults/win-64)}
16 statsmodels {0.12.2 (defaults/win-64) -> 0.13.0 (defaults/win-64)}
17 sympy {1.8 (defaults/win-64) -> 1.9 (defaults/win-64)}
18 tqdm {4.62.2 (defaults/noarch) -> 4.62.3 (defaults/noarch)}
19 xlwings {0.24.7 (defaults/win-64) -> 0.24.9 (defaults/win-64)}
20
- The issue seems to be
freetype
- Downgrading from
2.11.0
to2.10.4
resolved the issue and made the environment work withmatplotlib
- Downgrading from
- Went to post a bug report and discovered there is [Bug]: Matplotlib crashes Python #21511
Community Discussions contain sources that include Stack Exchange Network