widyr | Widen, process, and re-tidy a dataset | Development Tools library
kandi X-RAY | widyr Summary
This package wraps the pattern of un-tidying data into a wide matrix, performing some processing, then turning it back into a tidy form. This is useful for several mathematical operations such as co-occurrence counts, correlations, or clustering that are best done on a wide matrix.
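As a minimal sketch of the widen, process, re-tidy pattern described above, the snippet below counts how often pairs of words share a document. The data frame `dat` and its columns `doc` and `word` are made up for illustration; only `pairwise_count()` comes from widyr.

```r
library(dplyr)
library(widyr)

# Hypothetical tidy input: one row per (document, word) pair
dat <- tibble(
  doc  = c(1, 1, 2, 2, 2, 3),
  word = c("a", "b", "a", "b", "c", "c")
)

# pairwise_count() widens the data to a word-by-document matrix,
# counts shared documents for each pair of words, and returns a
# tidy tibble with columns item1, item2, and n.
pairs <- dat %>%
  pairwise_count(word, doc, sort = TRUE)

pairs
```

Words "a" and "b" appear together in documents 1 and 2, so that pair gets a count of 2; every other pair here shares one document.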
Community Discussions
Trending Discussions on widyr
QUESTION
I am trying to compute the pairwise similarity between accounts using similar hashtags over time.
I have code (below) that gives me the pairwise similarity between accounts for the most recent 300 tweets sent by each account. However, I would like to compute the pairwise similarity between accounts for specific slices of time (day, week, month). How can I do that?
...ANSWER
Answered 2021-Nov-08 at 18:11
Using group_by() should work:
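The answer's own code is not reproduced here. One way the grouping idea could look is sketched below, assuming a hypothetical tibble `tweets` with columns `account`, `hashtag`, and `created_at`. It runs `pairwise_similarity()` explicitly within each group via `group_modify()`; the `"week"` unit can be swapped for `"day"` or `"month"`.

```r
library(dplyr)
library(widyr)

# Hypothetical input: one row per hashtag use
tweets <- tibble(
  account    = c("u1", "u1", "u2", "u2", "u1", "u2"),
  hashtag    = c("#r", "#stats", "#r", "#stats", "#r", "#python"),
  created_at = as.Date(c("2021-11-01", "2021-11-02", "2021-11-01",
                         "2021-11-03", "2021-11-08", "2021-11-09"))
)

sim_by_week <- tweets %>%
  # Assign each tweet to a time slice (weeks start on Sunday by default)
  mutate(week = lubridate::floor_date(created_at, "week")) %>%
  # Number of times each account used each hashtag in each slice
  count(week, account, hashtag) %>%
  group_by(week) %>%
  # Cosine similarity between accounts, computed separately per slice
  group_modify(~ pairwise_similarity(.x, account, hashtag, n)) %>%
  ungroup()

sim_by_week
```

In the first week both accounts use the same two hashtags equally often, so their similarity is 1; in the second week they share no hashtags.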
QUESTION
I have some data on published papers that looks like this:
...ANSWER
Answered 2021-Sep-29 at 04:36
Here's a base R option:
Create a pairwise combination with combn and use tapply to count how many papers have the combination in them.
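The question's data is not shown, so the following is only a sketch of the combn and tapply idea on a made-up data frame `papers` with hypothetical columns `paper` and `author`. It assumes each author is listed at most once per paper.

```r
# Hypothetical input: one row per (paper, author)
papers <- data.frame(
  paper  = c(1, 1, 1, 2, 2, 3),
  author = c("A", "B", "C", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# All unordered author pairs within each paper.
# Papers with fewer than two authors contribute no pairs.
pairs_per_paper <- lapply(split(papers$author, papers$paper), function(a) {
  if (length(a) < 2) return(character(0))
  combn(sort(a), 2, paste, collapse = "-")
})
all_pairs <- unlist(pairs_per_paper, use.names = FALSE)

# Count how many papers contain each pair
pair_counts <- tapply(all_pairs, all_pairs, length)
pair_counts
```

Sorting the authors before calling `combn()` keeps "A-B" and "B-A" from being counted as different pairs.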
QUESTION
I am new to shiny/flexdashboard and so far have been able to render plots and filter a data frame using values from selectInput with the help of req(input$user_input_value).
ISSUE: To run kmeans, I take the number of clusters as user input, but I am not able to code this in reactive form and get the error: object of type closure is not subsettable.
ANSWER
Answered 2021-Mar-28 at 19:30
The problem is in a reactive chunk. The reactive expression km.res takes the number of clusters as input, runs the model, and saves the output (and let's end the code chunk there).
Next, decide what you want to do with the output:
- to print the result, use renderPrint,
- to show it as a plot, use renderPlot,
- to show it as a table, use renderTable, etc.
Now let's print the output of the model with renderPrint(). The output can be accessed by calling the expression's name followed by parentheses, e.g., km.res().
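The pattern described in this answer can be sketched as a small plain Shiny app (not flexdashboard). The input id `n_clusters`, the output id `km_summary`, and the use of the built-in `iris` data are assumptions for illustration; only the reactive name `km.res` comes from the answer.

```r
library(shiny)

ui <- fluidPage(
  numericInput("n_clusters", "Number of clusters",
               value = 3, min = 1, max = 10),
  verbatimTextOutput("km_summary")
)

server <- function(input, output, session) {
  # Reactive expression: re-runs kmeans whenever the input changes
  km.res <- reactive({
    req(input$n_clusters)
    kmeans(scale(iris[, 1:4]), centers = input$n_clusters)
  })

  # A reactive is called like a function, so use km.res(), not km.res.
  # Writing km.res$cluster instead of km.res()$cluster triggers
  # "object of type 'closure' is not subsettable".
  output$km_summary <- renderPrint({
    km.res()
  })
}

# Build the app object; launch it with runApp(app)
app <- shinyApp(ui, server)
```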
QUESTION
I want to cluster words that are similar using R and the tidytext package.
I have created my tokens and would now like to convert them to a matrix in order to cluster them. I would like to try out a number of token techniques to see which provides the most compact clusters.
My code is as follows (taken from the docs of the widyr package). I just can't make the next step. Can anyone help?
ANSWER
Answered 2021-Feb-07 at 17:59
You can create an appropriate matrix for this by casting from tidytext. There are several cast_ functions, such as cast_sparse().
Let's use four example books, and cluster the chapters within the books:
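The book example itself is not reproduced here. As a small stand-in, the sketch below casts a made-up table of word counts (hypothetical columns `document`, `word`, `n`) to a sparse matrix with `cast_sparse()` and passes a dense copy to `kmeans()`. The choice of two clusters is arbitrary.

```r
library(dplyr)
library(tidytext)

# Hypothetical word counts: one row per (document, word)
word_counts <- tibble(
  document = c("d1", "d1", "d2", "d2", "d3", "d3", "d4", "d4"),
  word     = c("whale", "sea", "whale", "ship",
               "love", "letter", "love", "ball"),
  n        = c(5, 3, 4, 2, 6, 2, 5, 1)
)

# Rows are documents, columns are words, cells are counts
m <- word_counts %>%
  cast_sparse(document, word, n)

# kmeans() needs a dense numeric matrix
set.seed(1)
km <- kmeans(as.matrix(m), centers = 2, nstart = 10)
km$cluster
```

For a large vocabulary the dense conversion can be expensive, so reducing the number of words first (for example by filtering rare terms) may be needed.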
QUESTION
I am trying to calculate how many species are shared between region pairs. So if species A, B, and C all occur in, say, the US and Canada, but species D only occurs in Canada, then the number of shared species for the US-Canada pair is 3. I've used the widyr package in R to do this on a large dataset (>10,000 species in 50 countries), but I'm running into two problems:
1) If I analyze the whole dataset, some region pairs are missing, whereas if I subset just the two regions, it works (it gives the number of shared species for that pair).
2) When I tried with just a subset of the regions, I got different answers for how many species are shared, compared to using the full dataset.
I've seen some similar posts about calculating the distance between pairs, and unique pairs, but it's not quite what I'm trying to do. I've never used widyr before and I can't figure out what the mistake is - I'm wondering if it's something with "first" and "last"?
...ANSWER
Answered 2021-Feb-02 at 09:59
Never used widyr. You can try something like this:
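The answer's code is not shown, so what follows is a separate base R sketch rather than the answerer's method. It builds a species-by-region presence matrix and multiplies it by itself, using a made-up data frame `occ` with hypothetical columns `species` and `region` that mirrors the US and Canada example in the question.

```r
# Hypothetical occurrences: one row per (species, region)
occ <- data.frame(
  species = c("A", "B", "C", "A", "B", "C", "D"),
  region  = c("US", "US", "US", "Canada", "Canada", "Canada", "Canada"),
  stringsAsFactors = FALSE
)

# Drop duplicate records so every cell is 0 or 1
occ <- unique(occ)

# Species-by-region presence/absence matrix
presence <- unclass(table(occ$species, occ$region))

# t(presence) %*% presence: entry [i, j] is the number of species
# present in both region i and region j
shared <- crossprod(presence)
shared["US", "Canada"]
```

The diagonal of `shared` holds the number of species per region, and every region pair appears in the result, including pairs with zero shared species.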
QUESTION
I would like to understand the practical differences between the following cases:
- Use the function fcm(objectname) to generate a feature co-occurrence matrix and calculate the absolute frequencies, then plot it with the function textplot_network().
- I read tutorials such as tidytextmining and one written by Andreas Niekler and Gregor Wiedemann, which use the igraph or widyr packages. I want to plot correlated word pairs. Inspired by the tidytextmining tutorial, which uses the phi coefficient, I want to plot this correlation using the lambda coefficient instead.
I don't know how to plot the correlated word pairs with the quanteda package.
My idea (maybe not an efficient one) is to compute textstat_collocations(), transform the result into a tibble, and plot it with the functions of the widyr package.
My open question is:
How can I split the column collocation into two separate columns, such as item1 and item2, then select the column lambda as well and assign the result to a tibble object?
ANSWER
Answered 2020-Oct-04 at 13:52
Like this? Remove the select() command if you prefer to keep all of the columns.
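The answer's code is not reproduced here. A sketch of the split-and-select step is shown below on a made-up data frame `colls` that imitates the shape of `textstat_collocations()` output for two-word collocations. All values in it are invented.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for textstat_collocations() output
colls <- data.frame(
  collocation = c("united states", "federal government"),
  count       = c(157, 32),
  lambda      = c(7.9, 5.1),
  z           = c(41.2, 21.0),
  stringsAsFactors = FALSE
)

word_pairs <- colls %>%
  as_tibble() %>%
  # Split "word1 word2" into two columns at the space
  separate(collocation, into = c("item1", "item2"), sep = " ") %>%
  # Drop this line to keep all of the columns
  select(item1, item2, lambda)

word_pairs
```

This only works as written for collocations of exactly two words; longer collocations would need more names in `into`.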
QUESTION
When I use unnest_tokens on a list I enter manually, the output includes the row number each word came from.
...ANSWER
Answered 2020-May-19 at 14:10
I guess that if you import the text using text <- read.csv("TextSample.csv", stringsAsFactors=FALSE), text is a data frame, while if you enter it manually it is a vector.
If you alter the code to text_df <- tibble(text = text$col_name) to select the column from the data frame (which is a vector) in the csv case, I think you should get the same result as before.
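To illustrate the difference, the sketch below builds a small data frame in place of the CSV import and pulls the text column out as a vector before tokenizing. The column name `col_name` is the placeholder from the answer, and the explicit `line` column is an addition for illustration so the row each word came from stays visible.

```r
library(dplyr)
library(tidytext)

# Stand-in for read.csv("TextSample.csv", stringsAsFactors = FALSE)
text <- data.frame(
  col_name = c("The quick brown fox", "jumps over the lazy dog"),
  stringsAsFactors = FALSE
)

# text$col_name is a character vector, like manually entered text
text_df <- tibble(
  line = seq_along(text$col_name),
  text = text$col_name
)

# One row per word, with the source line number kept
tokens <- text_df %>%
  unnest_tokens(word, text)

tokens
```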
QUESTION
I am trying to calculate the correlation between articles to get an indication of how often different article numbers appear together in a document (invoice).
I have a table from an SQL query with two columns: Document Number and Article Number. The table is quite large, with 21k rows.
I have 5k document numbers and 700 different articles, like the sample shown below. It's a data frame named "db_belege".
...ANSWER
Answered 2020-Mar-10 at 15:51
If you want to bring the data into a proper format to use cor, we can use tidyr's pivot_wider and then convert the result into a matrix:
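A sketch of that reshaping step follows, using a tiny made-up version of `db_belege` with hypothetical column names `document` and `article`. The presence indicator `present` is also an assumption, since the real sample is not shown.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample: one row per (document, article)
db_belege <- tibble(
  document = c(1, 1, 2, 2, 3, 4),
  article  = c("A", "B", "A", "B", "C", "A")
)

wide <- db_belege %>%
  distinct() %>%
  mutate(present = 1) %>%
  # One row per document, one column per article, 0 where absent
  pivot_wider(names_from = article, values_from = present,
              values_fill = list(present = 0))

# Drop the document column and convert to a numeric matrix
m <- as.matrix(wide[, -1])
rownames(m) <- wide$document

# Article-by-article correlation of the presence indicators
cor_mat <- cor(m)
cor_mat
```

An article that appears in every document (or in none) has zero variance, so `cor()` returns NA for it with a warning.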
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported