kandi X-RAY | QQMusic Summary
A high-fidelity clone of QQ Music
Community Discussions
Trending Discussions on QQMusic
QUESTION
I am trying to create a dummy-variable dataset in PySpark. I have loaded the data from a Hive table and stored it as an RDD.
The data size is 20000000 * 11 (rows * columns).
On the cluster, the RDD represents the Hive table as a nested list, and I am confused about how to process it (it differs from plain Python).
Question: is there any way to keep the first two variables (id and label) and insert a new variable that identifies the label (as a nominal variable, e.g. group_1, group_2, based on conditional assignment) before producing the dummy-variable coding (based on the rest of the variables)?
I have tried a key-value approach, but it does not work. Any thoughts would be appreciated.
Desired result (dummy coding): the length of the dummy encoding (0, 1) is determined by the number of unique elements of all variables across rows.
[u'007896797eed11ba73dd', u'18-24', 0, 0, 0,0,0,0,0,1,1,1,1]
For example, the above user did not install ins_com.meitu.meipaimv, so that position would be 0; if ins_com.meitu.xxxx is installed, its position is 1.
Example data:
[[u'007896797eed11ba73dd', u'18-24', u'ins_com.meitu.meipaimv,ins_com.babytree.apps.pregnancy,ins_com.sankuai.meituan,ins_cn.damai,ins_com.google.android.gms,ins_com.taobao.taobao,ins_com.sina.weibo,ins_com.google.android.syncadapters.calendar,ins_com.tencent.qqmusic,ins_com.tencent.mm,ins_com.lemon.faceu,ins_com.zhihu.android,ins_com.Qunar,ins_com.eg.android.AlipayGphone,ins_com.airbnb.android,ins_com.lingan.seeyou,ins_com.qicai.translate,ins_com.mt.mtxx.mtxx,ins_vz.com,ins_com.ganji.android,ins_com.google.android.gsf,ins_com.taobao.trip,ins_com.mfw.roadbook,ins_com.tencent.mobileqq', u'act_cn.damai,act_com.taobao.trip,act_com.sankuai.meituan,act_com.google.android.gms,act_com.eg.android.AlipayGphone,act_com.tencent.mm,act_com.sina.weibo,act_com.babytree.apps.pregnancy,act_com.taobao.taobao,act_com.meitu.meipaimv,act_com.mfw.roadbook,act_com.zhihu.android,act_com.mt.mtxx.mtxx', u'inst_ct_21_40', u'installed1', u'inst_cate_ct_13_16', u'active_ct_10_20', u'activ_1', u'phone_price_2500_3500', u''], [u'4ac74594b0fe17b532e7f278', u'25-34', u'ins_com.easysay.japanese,ins_com.tencent.mobileqq,ins_com.eg.android.AlipayGphone,ins_com.google.android.syncadapters.calendar,ins_com.zhaopin.social,ins_com.taobao.taobao,ins_com.kugou.android,ins_com.sina.weibo,ins_com.tencent.qqlive,ins_cmb.pb,ins_com.android.browser,ins_com.baidu.searchbox,ins_com.sohu.inputmethod.sogou,ins_cn.wps.moffice_eng,ins_com.qiyi.video,ins_com.tencent.mm,ins_com.autonavi.minimap,ins_com.luojilab.player,ins_com.liulishuo.engzo', u'act_com.zhaopin.social,act_com.tencent.mm,act_com.sina.weibo', u'inst_ct_0_20', u'installed12,installed9', u'inst_cate_ct_9_12', u'active_ct_4_5', u'activ_12', u'phone_price_801_1500', u'']]
ANSWER
Answered 2017-Dec-01 at 16:17
First, find the full set of elements and fix their order.
Then map the data to this set.
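The two steps above could be sketched roughly as follows. This is a minimal plain-Python illustration rather than the answerer's actual code: the helper names (`split_tokens`, `build_vocabulary`, `encode_row`) and the tiny sample rows are hypothetical, and it assumes that columns 0 and 1 hold the id and label while every remaining column is a comma-separated string.

```python
def split_tokens(row):
    """Return the individual tokens found in the feature columns of a row.

    Assumption: columns 0 and 1 are the id and label and are skipped;
    every other column is a comma-separated string. Empty strings are ignored.
    """
    tokens = []
    for field in row[2:]:
        tokens.extend(t for t in field.split(",") if t)
    return tokens


def build_vocabulary(rows):
    """Step 1: collect every distinct token and fix an order by sorting."""
    vocab = set()
    for row in rows:
        vocab.update(split_tokens(row))
    return sorted(vocab)


def encode_row(row, index):
    """Step 2: keep id and label, then append one 0/1 flag per vocabulary entry."""
    flags = [0] * len(index)
    for token in split_tokens(row):
        flags[index[token]] = 1
    return [row[0], row[1]] + flags


# Hypothetical sample rows in the same shape as the question's data.
rows = [
    ["u1", "18-24", "ins_a,ins_b", "act_a", ""],
    ["u2", "25-34", "ins_b,ins_c", "", "phone_price_801_1500"],
]

vocab = build_vocabulary(rows)
index = {token: i for i, token in enumerate(vocab)}  # token -> column position
encoded = [encode_row(row, index) for row in rows]
```

On an RDD, the same helpers could plausibly be applied as `rdd.flatMap(split_tokens).distinct().collect()` to build the vocabulary and `rdd.map(lambda r: encode_row(r, index))` to encode each row. With 20000000 rows and a large vocabulary, a sparse representation may be more practical than dense 0/1 lists.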
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported