BullshitGenerator | Needs to generate some texts to test if my GUI rendering | Data Manipulation library

 by   menzi11 JavaScript Version: Current License: Non-SPDX

kandi X-RAY | BullshitGenerator Summary

BullshitGenerator is a JavaScript library typically used in Utilities, Data Manipulation, and React applications. BullshitGenerator has no bugs and no vulnerabilities, and it has medium support. However, BullshitGenerator has a Non-SPDX License. You can download it from GitHub.
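Occasionally I need some Chinese text to test text rendering during GUI development. This project does only that; please do not use it for any other purpose. I needed to generate some text to test whether my GUI rendering code works correctly, so I made this.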

                      kandi-support Support

                        BullshitGenerator has a medium active ecosystem.
                        It has 15752 star(s) with 2962 fork(s). There are 275 watchers for this library.
                        It had no major release in the last 6 months.
                        There are 120 open issues and 34 have been closed. On average issues are closed in 85 days. There are 36 open pull requests and 0 closed requests.
                        It has a neutral sentiment in the developer community.
                        The latest version of BullshitGenerator is current.

                                  kandi-Quality Quality

                                    BullshitGenerator has 0 bugs and 0 code smells.

                                              kandi-Security Security

                                                BullshitGenerator has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
                                                BullshitGenerator code analysis shows 0 unresolved vulnerabilities.
                                                There are 0 security hotspots that need review.

                                                          kandi-License License

                                                            BullshitGenerator has a Non-SPDX License.
                                                            A Non-SPDX license may be an open-source license that is not SPDX-compliant, or it may be a non-open-source license; review it closely before use.

                                                                      kandi-Reuse Reuse

                                                                        BullshitGenerator releases are not available. You will need to build from source code and install.

                                                                                  BullshitGenerator Key Features

                                                                                  I needed to generate some text to test whether my GUI rendering code works correctly, so I made this.

                                                                                  BullshitGenerator Examples and Code Snippets

                                                                                  No Code Snippets are available at this moment for BullshitGenerator.
                                                                                  Community Discussions

                                                                                  Trending Discussions on Data Manipulation

                                                                                  R: Is there a "Un-Character" Command in R?
                                                                                  Creating new columns based on data in row separated by specific character in R
                                                                                  Multiplying and Adding Values across Rows
                                                                                  How to make a rank column in R
                                                                                  How to return the column title wherein the row contains the greatest value in Pandas Dataframe
                                                                                  Split large csv file into multiple files based on column(s)
                                                                                  Get the first non-null value from selected cells in a row
                                                                                  pivot_longer with column pairs
                                                                                  Simulating Random Draws From a "Hat"
                                                                                  Break Apart a String into Separate Columns R

                                                                                  QUESTION

                                                                                  R: Is there a "Un-Character" Command in R?
                                                                                  Asked 2022-Apr-10 at 17:37

                                                                                  I am working with the R programming language.

                                                                                  I have the following dataset:

                                                                                  v <- c(1,2,3,4,5,6,7,8,9,10)
                                                                                  
                                                                                  var_1 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))
                                                                                  
                                                                                  var_2 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))
                                                                                  
                                                                                  var_3 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))
                                                                                  
                                                                                  var_4 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))
                                                                                  
                                                                                  var_5 <- as.factor(sample(v, 10000, replace=TRUE, prob=c(0.1,0.1,0.1,0.1,0.1, 0.1,0.1,0.1,0.1,0.1)))
                                                                                  
                                                                                  my_data = data.frame(var_1, var_2, var_3, var_4, var_5)
                                                                                  

                                                                                  I also have another dataset of "conditions" that will be used for querying this data frame:

                                                                                  conditions = data.frame(cond_1 = c("1,3,4", "4,5,6"), cond_2 = c("5,6", "7,8,9"))
                                                                                  

                                                                                  My Question: I tried to run the following command to select rows from "my_data" based on the first row of "conditions" - but this returns an empty result:

                                                                                  my_data[my_data$var_1 %in% unlist(conditions[1,1]) &
                                                                                              my_data$var_2 %in% unlist(conditions[1,2]), ]
                                                                                  
                                                                                  [1] var_1 var_2 var_3 var_4 var_5
                                                                                  <0 rows> (or 0-length row.names)
                                                                                  

                                                                                  I tried to look more into this by "inspecting" these conditions:

                                                                                  class(conditions[1,1])
                                                                                  [1] "character"
                                                                                  

                                                                                  This makes me think that the "unlist()" command is not working because the conditions themselves are a "character" instead of a "list".

                                                                                  Is there an equivalent command that can be used here that plays the same role as the "unlist()" command so that the above statement can be run?

                                                                                  In general, I am trying to produce the same results as I would have gotten from this code - but keeping the format I was using above:

                                                                                  my_data[my_data$var_1 %in% c("1", "3", "4") &
                                                                                              my_data$var_2 %in% c("5", "6"), ]
                                                                                  

                                                                                  ANSWER

                                                                                  Answered 2022-Apr-10 at 05:36

                                                                                  Up front, "1,3,4" != 1. It seems you should look to split the strings using strsplit(., ",").
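To illustrate the point about splitting, here is a minimal sketch of why the lookup fails on the unsplit string. The variable names `cond` and `parts` are made up for this illustration and do not appear in the question.

```r
# One cell of the conditions data frame: a single string, not three values
cond <- "1,3,4"

# unlist() has nothing to flatten here, so the string comes back unchanged
same_after_unlist <- identical(unlist(cond), cond)

# %in% compares "3" against the whole string "1,3,4", so there is no match
found_unsplit <- "3" %in% cond

# strsplit() returns a list with one element per input string;
# [[1]] extracts the character vector "1", "3", "4"
parts <- strsplit(cond, ",")[[1]]
found_split <- "3" %in% parts

print(list(same_after_unlist, found_unsplit, parts, found_split))
```

Because `strsplit()` always returns a list, the `[[1]]` step matters: without it, `%in%` would again compare against a single list element rather than the individual values.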

                                                                                  expected <- my_data[my_data$var_1 %in% c("1", "3", "4") & my_data$var_2 %in% c("5", "6"), ]
                                                                                  head(expected)
                                                                                  #     var_1 var_2 var_3 var_4 var_5
                                                                                  # 18      3     6     2     2     9
                                                                                  # 129     3     5     3     2     8
                                                                                  # 133     4     5     6     5     8
                                                                                  # 186     1     6     6    10    10
                                                                                  # 204     4     6     4     2     6
                                                                                  # 207     1     5     3     2     9
                                                                                  
                                                                                  out <- my_data[do.call(`&`, 
                                                                                    Map(`%in%`,
                                                                                        lapply(my_data[,1:2], as.character), 
                                                                                        lapply(conditions, function(z) strsplit(z, ",")[[1]]))),]
                                                                                  head(out)
                                                                                  #     var_1 var_2 var_3 var_4 var_5
                                                                                  # 18      3     6     2     2     9
                                                                                  # 129     3     5     3     2     8
                                                                                  # 133     4     5     6     5     8
                                                                                  # 186     1     6     6    10    10
                                                                                  # 204     4     6     4     2     6
                                                                                  # 207     1     5     3     2     9
                                                                                  

                                                                                  Edit: update for new conditions: change do.call to Reduce:

                                                                                  conditions = data.frame(cond_1 = c("1,3,4", "4,5,6"), cond_2 = c("5,6", "7,8,9"), cond_3 = c("4,6", "9"))
                                                                                  out <- my_data[Reduce(`&`,
                                                                                    Map(`%in%`,
                                                                                        lapply(my_data[,1:3], as.character),
                                                                                        lapply(conditions, function(z) strsplit(z, ",")[[1]]))),]
                                                                                  head(out)
                                                                                  #     var_1 var_2 var_3 var_4 var_5
                                                                                  # 133     4     5     6     5     8
                                                                                  # 186     1     6     6    10    10
                                                                                  # 204     4     6     4     2     6
                                                                                  # 232     1     5     6     5     8
                                                                                  # 332     3     6     6     5    10
                                                                                  # 338     1     5     6     3     6
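The switch from do.call to Reduce can be understood with a small sketch. The three logical vectors below are hypothetical stand-ins for the per-column `%in%` results; they are not taken from the data above.

```r
# Three hypothetical logical masks, one per condition column
masks <- list(c(TRUE, TRUE, FALSE),
              c(TRUE, FALSE, FALSE),
              c(TRUE, TRUE, TRUE))

# Reduce() folds the binary `&` over the list: (m1 & m2) & m3
keep <- Reduce(`&`, masks)

# do.call() would instead attempt one call `&`(m1, m2, m3),
# which `&` rejects because it takes at most two arguments
three_arg_call <- try(do.call(`&`, masks), silent = TRUE)
failed <- inherits(three_arg_call, "try-error")

print(keep)
print(failed)
```

With exactly two masks both forms behave the same, which is presumably why the first version of the answer worked for two condition columns.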
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71813866

                                                                                  QUESTION

                                                                                  Creating new columns based on data in row separated by specific character in R
                                                                                  Asked 2022-Mar-15 at 08:48

                                                                                  I've the following table

                                                                                  Owner  Pet              Housing_Type
                                                                                  A      Cats;Dog;Rabbit  3
                                                                                  B      Dog;Rabbit       2
                                                                                  C      Cats             2
                                                                                  D      Cats;Rabbit      3
                                                                                  E      Cats;Fish        1

                                                                                  The code is as follows:

                                                                                  Data_Pets = structure(list(Owner = structure(1:5, .Label = c("A", "B", "C", "D",
                                                                                   "E"), class = "factor"), Pets = structure(c(2L, 5L, 1L,4L, 3L), .Label = c("Cats ",
                                                                                   "Cats;Dog;Rabbit", "Cats;Fish","Cats;Rabbit", "Dog;Rabbit"), class = "factor"), 
                                                                                  House_Type = c(3L,2L, 2L, 3L, 1L)), class = "data.frame", row.names = c(NA, -5L))
                                                                                  

                                                                                  Can anyone advise me how I can create new columns based on the data in Pet column by creating a new column for each animal separated by ; to look like the following table?

                                                                                  Owner  Cats  Dog  Rabbit  Fish  Housing_Type
                                                                                  A      Y     Y    Y       N     3
                                                                                  B      N     Y    Y       N     2
                                                                                  C      N     Y    N       N     2
                                                                                  D      Y     N    Y       N     3
                                                                                  E      Y     N    N       Y     1

                                                                                  Thanks!

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-15 at 08:48

                                                                                  One approach is to define a helper function that matches for a specific animal, then bind the columns to the original frame.

                                                                                  Note that some wrangling is done to get rid of whitespace to identify the unique animals to query.

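                                                                                  library(dplyr)
                                                                                  
                                                                                  df <- Data_Pets  # assumption: df is the Data_Pets frame from the question
                                                                                  
                                                                                  # Returns "Y"/"N" for every row of `string`; vectorized over `match`,
                                                                                  # so the result has one column per animal
                                                                                  f <- Vectorize(function(string, match) {
                                                                                    ifelse(grepl(match, string), "Y", "N")
                                                                                  }, c("match"))
                                                                                  
                                                                                  df %>%
                                                                                    bind_cols(
                                                                                      f(df$Pets, unique(unlist(strsplit(trimws(as.character(df$Pets)), ";"))))
                                                                                    )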
                                                                                  
                                                                                    Owner            Pets House_Type Cats Dog Rabbit Fish
                                                                                  1     A Cats;Dog;Rabbit          3    Y   Y      Y    N
                                                                                  2     B      Dog;Rabbit          2    N   Y      Y    N
                                                                                  3     C           Cats           2    Y   N      N    N
                                                                                  4     D     Cats;Rabbit          3    Y   N      Y    N
                                                                                  5     E       Cats;Fish          1    Y   N      N    Y
                                                                                  

                                                                                  More generally, if you are not sure that the separator is ; or that the values are free of whitespace, stringi is useful:

                                                                                  dplyr::bind_cols(
                                                                                    df,
                                                                                    f(df$Pets, unique(unlist(stringi::stri_extract_all_words(df$Pets))))
                                                                                  )
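As a complement to the grepl-based helper, here is a base R sketch that uses exact matching on the split values instead of pattern matching. This is a different technique from the answer's, shown only for comparison; the data frame `pets` is a hypothetical plain-character copy of the question's data, and the names `pet_lists`, `animals`, `flags`, and `result` are made up for this example.

```r
# Hypothetical stand-in for the question's data, with plain character columns
pets <- data.frame(
  Owner = c("A", "B", "C", "D", "E"),
  Pets = c("Cats;Dog;Rabbit", "Dog;Rabbit", "Cats ", "Cats;Rabbit", "Cats;Fish"),
  House_Type = c(3L, 2L, 2L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Trim stray whitespace, then split each row into its individual animals
pet_lists <- strsplit(trimws(pets$Pets), ";", fixed = TRUE)

# Unique animals in order of first appearance
animals <- unique(unlist(pet_lists))

# One "Y"/"N" column per animal, using exact membership rather than a regex
flags <- sapply(animals, function(a) {
  ifelse(vapply(pet_lists, function(p) a %in% p, logical(1)), "Y", "N")
})

result <- cbind(pets, flags)
print(result)
```

Exact membership avoids accidental partial matches (for example, a pattern "Cat" matching "Cats"), at the cost of requiring a known separator.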
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71478316

                                                                                  QUESTION

                                                                                  Multiplying and Adding Values across Rows
                                                                                  Asked 2022-Mar-10 at 08:24

                                                                                  I have this data frame:

                                                                                  color <- c("AKZ", "ZZA", "KAK")    
                                                                                  color_1 <- sample(color, 100, replace=TRUE, prob=c(0.4, 0.3, 0.3))
                                                                                  id = 1:100
                                                                                  
                                                                                  sample_data = data.frame(id, color_1)
                                                                                  
                                                                                  
                                                                                   id color_1
                                                                                  1  1     KAK
                                                                                  2  2     AKZ
                                                                                  3  3     KAK
                                                                                  4  4     KAK
                                                                                  5  5     AKZ
                                                                                  6  6     ZZA
                                                                                  

                                                                                  Suppose there is a legend:

                                                                                  • K = 3
                                                                                  • A = 4
                                                                                  • Z = 6

                                                                                  I want to add two columns to the above data frame:

                                                                                  • sample_data$add_score : e.g. KAK = K + A + K = 3 + 4 + 3 = 10
                                                                                  • sample_data$multiply_score : e.g. KAK = K * A * K = 3 * 4 * 3 = 36

                                                                                  I thought of solving the problem like this:

                                                                                  sample_data$first = substr(color_1,1,1)
                                                                                  sample_data$second = substr(color_1,2,2)
                                                                                  sample_data$third = substr(color_1,3,3)
                                                                                  
                                                                                  sample_data$first_score = ifelse(sample_data$first == "K", 3, ifelse(sample_data$first == "A", 4, 6))
                                                                                   
                                                                                  sample_data$second_score = ifelse(sample_data$second == "K", 3, ifelse(sample_data$second == "A", 4, 6))
                                                                                  
                                                                                  sample_data$third_score = ifelse(sample_data$third == "K", 3, ifelse(sample_data$third == "A", 4, 6))
                                                                                  
                                                                                  sample_data$add_score = sample_data$first_score + sample_data$second_score + sample_data$third_score
                                                                                  
                                                                                  sample_data$multiply_score = sample_data$first_score * sample_data$second_score * sample_data$third_score
                                                                                  

                                                                                  But I think this way would take a long time if the length of "color_1" was longer. Given a scoring legend, is there a faster way to do this?

                                                                                  Thank you!

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-10 at 04:12

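                                                                                  We can use stri_replace_all_regex to replace each letter in color_1 with its integer value followed by an arithmetic operator. The trailing operator is then stripped with gsub, and the resulting expression (for example "3+4+3") is evaluated.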

                                                                                  Here I've stored your values into a vector color_1_convert. We can use this as the input in stri_replace_all_regex for better management of the values.

                                                                                  library(dplyr)
                                                                                  library(stringi)
                                                                                  
                                                                                  color_1_convert <- c("K" = "3", "A" = "4", "Z" = "6")
                                                                                  
                                                                                  sample_data %>%
                                                                                    group_by(id) %>%
                                                                                    mutate(
                                                                                      add_score = eval(parse(text = gsub("\\+$", "", stri_replace_all_regex(
                                                                                        color_1, names(color_1_convert), paste0(color_1_convert, "+"), vectorize_all = FALSE)))),
                                                                                      multiply_score = eval(parse(text = gsub("\\*$", "", stri_replace_all_regex(
                                                                                        color_1, names(color_1_convert), paste0(color_1_convert, "*"), vectorize_all = FALSE))))
                                                                                    )
                                                                                  
                                                                                  # A tibble: 100 × 4
                                                                                  # Groups:   id [100]
                                                                                        id color_1 add_score multiply_score
                                                                                                       
                                                                                   1     1 KAK            10             36
                                                                                   2     2 ZZA            16            144
                                                                                   3     3 AKZ            13             72
                                                                                   4     4 ZZA            16            144
                                                                                   5     5 AKZ            13             72
                                                                                   6     6 AKZ            13             72
                                                                                   7     7 AKZ            13             72
                                                                                   8     8 KAK            10             36
                                                                                   9     9 ZZA            16            144
                                                                                  10    10 AKZ            13             72
                                                                                  # … with 90 more rows
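
For comparison, the same arithmetic can be sketched without building and evaluating an expression string. This is only an illustration in Python, not part of the original answer; the function name `score` and the constant `COLOR_VALUES` are hypothetical, and the letter values are copied from `color_1_convert` above.

```python
from math import prod

# Letter values taken from the answer's color_1_convert vector.
COLOR_VALUES = {"K": 3, "A": 4, "Z": 6}

def score(code, values=COLOR_VALUES):
    """Return (sum, product) of the values mapped to each letter in code."""
    numbers = [values[letter] for letter in code]
    return sum(numbers), prod(numbers)

for code in ("KAK", "ZZA", "AKZ"):
    add_score, multiply_score = score(code)
    print(code, add_score, multiply_score)
```

Looking each letter up directly avoids parsing text as code, at the cost of raising a KeyError for any letter that has no assigned value.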
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71418533

                                                                                  QUESTION

                                                                                  How to make a rank column in R
                                                                                  Asked 2022-Mar-07 at 16:19

                                                                                  I have a data frame with columns M1, M2 and M3. These M values correspond to the values obtained by each method. I now want to add a rank column for each of them. For M1 and M2, the rank should run from the highest value to the lowest, and for M3 in reverse. The desired output table is shown below.

                                                                                  df1<-structure(list(M1 = c(400,300, 200, 50), M2 = c(500,200, 10, 100), M3 = c(420,330, 230, 51)), class = "data.frame", row.names = c(NA,-4L))
                                                                                  
                                                                                  > df1
                                                                                     M1  M2  M3
                                                                                  1 400 500 420
                                                                                  2 300 200 330
                                                                                  3 200  10 230
                                                                                  4  50 100  51
                                                                                  

                                                                                  Output

                                                                                  > df1
                                                                                     M1 rank  M2 rank  M3 rank
                                                                                  1 400    1 500    1 420    4
                                                                                  2 300    2 200    2 330    3
                                                                                  3 200    3  10    4 230    2
                                                                                  4  50    4 100    3  51    1
                                                                                  

                                                                                  Adjust rankings:

                                                                                  I used the code, but in a case I'm working on, my rankings looked like this:

                                                                                  ANSWER

                                                                                  Answered 2022-Mar-07 at 14:15

                                                                                  Using rank and relocate:

                                                                                  library(dplyr)
                                                                                  
                                                                                  df1 %>% 
                                                                                    mutate(across(M1:M2, ~ rank(-.x), .names = "{.col}_rank"),
                                                                                           M3_rank = rank(M3)) %>% 
                                                                                    relocate(order(colnames(.)))
                                                                                  
                                                                                     M1 M1_rank  M2 M2_rank  M3 M3_rank
                                                                                  1 400       1 500       1 420       4
                                                                                  2 300       2 200       2 330       3
                                                                                  3 200       3  10       4 230       2
                                                                                  4  50       4 100       3  51       1
                                                                                  

                                                                                  If you have duplicate values in your vector, you have to choose a method for handling ties. By default, rank assigns the average rank to tied values, but you can choose ties.method = "first" instead.

                                                                                  Another possibility, which I think is what you want, is to convert the ranks to a factor and then to numeric, so that you get only whole-number ranks (not the average) and tied values share the same rank.

                                                                                  df1 <- data.frame(M1 = c(400,300, 50, 300))
                                                                                  df1 %>% 
                                                                                    mutate(M1_rankAverage = rank(-M1),
                                                                                           M1_rankFirst = rank(-M1, ties.method = "first"),
                                                                                           M1_unique = as.numeric(as.factor(rank(-M1))))
                                                                                  
                                                                                     M1 M1_rankAverage M1_rankFirst M1_unique
                                                                                  1 400            1.0            1         1
                                                                                  2 300            2.5            2         2
                                                                                  3  50            4.0            4         3
                                                                                  4 300            2.5            3         2
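
To make the three tie-handling choices concrete, here is a small sketch in plain Python rather than R. It is an illustration under stated assumptions, not the answer's code: `rank_desc` and its `ties` argument are hypothetical names, and the "dense" option stands in for the factor-then-numeric conversion shown above.

```python
def rank_desc(values, ties="average"):
    """Rank values from highest (rank 1) to lowest.

    ties="average": tied values share the mean of their positions.
    ties="first":   tied values are ranked in order of appearance.
    ties="dense":   tied values share a rank and no rank is skipped.
    """
    # sorted() is stable, so equal values keep their original order.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    if ties == "first":
        for position, index in enumerate(order, start=1):
            ranks[index] = position
        return ranks
    start = 0
    dense = 0
    while start < len(order):
        # Extend the group while the next value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        dense += 1
        average = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average if ties == "average" else dense
        start = end + 1
    return ranks

m1 = [400, 300, 50, 300]
print(rank_desc(m1, "average"))
print(rank_desc(m1, "first"))
print(rank_desc(m1, "dense"))
```

With the M1 values from the example, the three calls reproduce the M1_rankAverage, M1_rankFirst and M1_unique columns respectively.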
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71381995

                                                                                  QUESTION

                                                                                  How to return the column title wherein the row contains the greatest value in Pandas Dataframe
                                                                                  Asked 2022-Feb-24 at 20:56

                                                                                  I am working on a Python project that has a DataFrame like this:

                                                                                  data = {'AAA':  [3, 8, 2, 1],
                                                                                          'BBB':  [5, 4, 7, 2],
                                                                                          'CCC':  [2, 5, 6, 4]}
                                                                                  df = pd.DataFrame(data)
                                                                                  

                                                                                  which leads to:

                                                                                     AAA  BBB  CCC
                                                                                  0    3    5    2
                                                                                  1    8    4    5
                                                                                  2    2    7    6
                                                                                  3    1    2    4

                                                                                  And the task consists of generating the following DataFrame:

                                                                                     AAA  BBB  CCC  Role
                                                                                  0    3    5    2  BBB
                                                                                  1    8    4    5  AAA
                                                                                  2    2    7    6  BBB
                                                                                  3    1    2    4  CCC

                                                                                  Each element of the "Role" column is the header of the column that holds the highest value in that row.

                                                                                  Could you please help me by suggesting a code that solves this task?

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-24 at 20:48

                                                                                  You could use the idxmax method with axis=1, which returns, for each row, the label of the column that holds the row's maximum:

                                                                                  df['Role'] = df.idxmax(axis=1)
                                                                                  

                                                                                  Output:

                                                                                     AAA  BBB  CCC  Role
                                                                                  0    3    5    2  BBB
                                                                                  1    8    4    5  AAA
                                                                                  2    2    7    6  BBB
                                                                                  3    1    2    4  CCC
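
To show what the row-wise lookup computes, here is a dependency-free sketch that works on a plain dict of columns instead of a DataFrame. The helper name `row_idxmax` is hypothetical; it only mirrors the idea of `idxmax(axis=1)` and is not pandas code.

```python
# Same data as in the question, as a plain dict of column lists.
data = {"AAA": [3, 8, 2, 1],
        "BBB": [5, 4, 7, 2],
        "CCC": [2, 5, 6, 4]}

def row_idxmax(columns):
    """For each row, return the name of the column with the largest value."""
    names = list(columns)
    n_rows = len(columns[names[0]])
    # max() returns the first maximal item, so ties go to the leftmost column.
    return [max(names, key=lambda name: columns[name][row])
            for row in range(n_rows)]

roles = row_idxmax(data)
print(roles)
```

On a tie, this sketch picks the leftmost column, which matches pandas returning the first occurrence of the maximum.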
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/71258033

                                                                                  QUESTION

                                                                                  Split large csv file into multiple files based on column(s)
                                                                                  Asked 2022-Feb-07 at 12:49

                                                                                  I would like to know of a fast/efficient way in any program (awk/perl/python) to split a csv file (say 10k columns) into multiple small files each containing 2 columns. I would be doing this on a unix machine.

                                                                                  #contents of large_file.csv
                                                                                  1,2,3,4,5,6,7,8
                                                                                  a,b,c,d,e,f,g,h
                                                                                  q,w,e,r,t,y,u,i
                                                                                  a,s,d,f,g,h,j,k
                                                                                  z,x,c,v,b,n,m,z
                                                                                  

                                                                                  I now want multiple files like this:

                                                                                  # contents of 1.csv
                                                                                  1,2
                                                                                  a,b
                                                                                  q,w
                                                                                  a,s
                                                                                  z,x
                                                                                  
                                                                                  # contents of 2.csv
                                                                                  1,3
                                                                                  a,c
                                                                                  q,e
                                                                                  a,d
                                                                                  z,c
                                                                                  
                                                                                  # contents of 3.csv
                                                                                  1,4
                                                                                  a,d
                                                                                  q,r
                                                                                  a,f
                                                                                  z,v
                                                                                  
                                                                                  and so on...
                                                                                  

                                                                                  I can do this currently with awk on small files (say 30 columns) like this:

                                                                                  awk -F, 'BEGIN{OFS=",";} {for (i=1; i < NF; i++) print $1, $(i+1) > i ".csv"}' large_file.csv
                                                                                  

                                                                                  The above takes a very long time with large files and I was wondering if there is a faster and more efficient way of doing the same.

                                                                                  Thanks in advance.

                                                                                  ANSWER

                                                                                  Answered 2021-Dec-12 at 05:22

                                                                                  Based on your samples and attempt, please try the following awk code. Because your version keeps every output file open at once, it may fail with the infamous "too many open files" error. To avoid that, this code collects all values in an array and prints them one by one in the END block, closing each output file as soon as its contents have been written.

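                                                                                  awk '
                                                                                  BEGIN{ FS=OFS="," }
                                                                                  {
                                                                                    # Buffer each column pair in memory instead of writing to many open files.
                                                                                    for(i=1;i<NF;i++){
                                                                                      value[i]=(NR>1 ? value[i] ORS : "") $1 OFS $(i+1)
                                                                                    }
                                                                                    nf=NF
                                                                                  }
                                                                                  END{
                                                                                    for(i=1;i<nf;i++){
                                                                                      outFile=i ".csv"
                                                                                      print value[i] > (outFile)
                                                                                      close(outFile)   # close right away to stay under the open-file limit
                                                                                    }
                                                                                  }
                                                                                  ' large_file.csv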
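
Since the question also allows Python, the buffering idea described in this answer can be sketched with the standard csv module. This is a minimal illustration, not a benchmarked solution: the function name `split_column_pairs` is hypothetical, and it assumes the whole file fits in memory and that every row has the same number of columns.

```python
import csv
import os

def split_column_pairs(src_path, out_dir):
    """Write <i>.csv containing column 1 paired with column i+1.

    Returns the number of files written.
    """
    with open(src_path, newline="") as src:
        rows = list(csv.reader(src))  # whole file held in memory
    if not rows:
        return 0
    n_cols = len(rows[0])
    for i in range(1, n_cols):
        out_path = os.path.join(out_dir, f"{i}.csv")
        # Only one output file is open at any time.
        with open(out_path, "w", newline="") as out:
            writer = csv.writer(out, lineterminator="\n")
            writer.writerows((row[0], row[i]) for row in rows)
    return n_cols - 1
```

Calling `split_column_pairs("large_file.csv", ".")` would write 1.csv, 2.csv and so on into the current directory. Reading the input once and opening one output at a time trades memory for a bounded number of open file handles.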
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70320648

                                                                                  QUESTION

                                                                                  Get the first non-null value from selected cells in a row
                                                                                  Asked 2022-Feb-04 at 09:55

                                                                                  Good afternoon, friends!

                                                                                  I'm currently performing some calculations in R (df is displayed below). My goal is to display in a new column the first non-null value from selected cells for each row.

                                                                                  My df is:

                                                                                  MD <- c(100, 200, 300, 400, 500)
                                                                                  liv <- c(0, 0, 1, 3, 4)
                                                                                  liv2 <- c(6, 2, 0, 4, 5)
                                                                                  liv3 <- c(1, 1, 1, 1, 1)
                                                                                  liv4 <- c(1, 0, 0, 3, 5)
                                                                                  liv5 <- c(0, 2, 7, 9, 10)
                                                                                  
                                                                                  df <- data.frame(MD, liv, liv2, liv3, liv4, liv5)
                                                                                  

                                                                                  I want to display, in a column called "liv6", the first non-null (that is, non-zero) value among the five columns liv to liv5. For the first row (liv = 0, liv2 = 6, liv3 = 1, liv4 = 1 and liv5 = 0), the result should be 6. This calculation should be repeated for each row in my data frame.

                                                                                  I do know how to do this in Python, but not in R.

                                                                                  Any help is highly appreciated!

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-03 at 11:16

                                                                                  One option with dplyr could be:

                                                                                  library(dplyr)

                                                                                  df %>%
                                                                                      rowwise() %>%
                                                                                      mutate(liv6 = with(rle(c_across(liv:liv5)), values[which.max(values != 0)]))
                                                                                  
                                                                                       MD   liv  liv2  liv3  liv4  liv5  liv6
                                                                                          
                                                                                  1   100     0     6     1     1     0     6
                                                                                  2   200     0     2     1     0     2     2
                                                                                  3   300     1     0     1     0     7     1
                                                                                  4   400     3     4     1     3     9     3
                                                                                  5   500     4     5     1     5    10     4
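
Since the asker mentions knowing the Python route, here is a short sketch of the same row-wise logic for comparison. The name `first_nonzero` is hypothetical, and the rows are the liv to liv5 values from the question's data frame.

```python
def first_nonzero(values, default=None):
    """Return the first value that is not zero, or default if all are zero."""
    return next((v for v in values if v != 0), default)

# One tuple per row: (liv, liv2, liv3, liv4, liv5).
rows = [
    (0, 6, 1, 1, 0),
    (0, 2, 1, 0, 2),
    (1, 0, 1, 0, 7),
    (3, 4, 1, 3, 9),
    (4, 5, 1, 5, 10),
]

liv6 = [first_nonzero(row) for row in rows]
print(liv6)
```

One difference worth noting: for a row made up entirely of zeros this sketch returns the default (None), whereas which.max on an all-FALSE vector returns position 1, so the R version would return 0.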
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70970158

                                                                                  QUESTION

                                                                                  pivot_longer with column pairs
                                                                                  Asked 2022-Feb-03 at 14:02

                                                                                  I am again struggling with transforming a wide df into a long one using pivot_longer. The data frame is the result of a power analysis for different effect sizes and sample sizes. This is what the original df looks like:

                                                                                    es_issue_owner es_independence es_party pwr_issue_owner_1200 pwr_independence_1200 pwr_party_1200 pwr_issue_owner_2400 pwr_independence_2400 pwr_party_2400
                                                                                  1            0.1             0.1      0.1                0.087                 0.080          0.081                0.130                 0.163          0.102
                                                                                  2            0.2             0.2      0.2                0.235                 0.273          0.157                0.406                 0.513          0.267
                                                                                  

                                                                                  Or with dput:

                                                                                  example <- structure(list(es_issue_owner = c(0.1, 0.2), es_independence = c(0.1, 
                                                                                  0.2), es_party = c(0.1, 0.2), pwr_issue_owner_1200 = c(0.087, 
                                                                                  0.235), pwr_independence_1200 = c(0.08, 0.273), pwr_party_1200 = c(0.081, 
                                                                                  0.157), pwr_issue_owner_2400 = c(0.13, 0.406), pwr_independence_2400 = c(0.163, 
                                                                                  0.513), pwr_party_2400 = c(0.102, 0.267)), row.names = 1:2, class = "data.frame")
                                                                                  

                                                                                  Each effect size (es) for three measures ("independence", "issue_owner", "party") is paired with a power calculation for a sample size of 1200 and one for a sample size of 2400. Based on the example above, the output I want would look like this:

                                                                                             type  es  pwr value
                                                                                  1  independence 0.1 1200 0.080
                                                                                  2   issue_owner 0.1 1200 0.087
                                                                                  3         party 0.1 1200 0.081
                                                                                  4  independence 0.2 1200 0.273
                                                                                  5   issue_owner 0.2 1200 0.235
                                                                                  6         party 0.2 1200 0.157
                                                                                  7  independence 0.1 2400 0.163
                                                                                  8   issue_owner 0.1 2400 0.130
                                                                                  9         party 0.1 2400 0.102
                                                                                  10 independence 0.2 2400 0.513
                                                                                  11  issue_owner 0.2 2400 0.406
                                                                                  12        party 0.2 2400 0.267
                                                                                  

                                                                                  or, with dput:

                                                                                  output <- structure(list(type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 
                                                                                  2L, 3L, 1L, 2L, 3L), .Label = c("independence", "issueowner", 
                                                                                  "party"), class = "factor"), es = c(0.1, 0.1, 0.1, 0.2, 0.2, 
                                                                                  0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2), pwr = c(1200, 1200, 1200, 
                                                                                  1200, 1200, 1200, 2400, 2400, 2400, 2400, 2400, 2400), value = c("0.080", 
                                                                                  "0.087", "0.081", "0.273", "0.235", "0.157", "0.163", "0.130", 
                                                                                  "0.102", "0.513", "0.406", "0.267")), out.attrs = list(dim = c(type = 3L, 
                                                                                  es = 2L, pwr = 2L, value = 1L), dimnames = list(type = c("type=independence", 
                                                                                  "type=issueowner", "type=party"), es = c("es=0.1", "es=0.2"), 
                                                                                      pwr = c("pwr=1200", "pwr=2400"), value = "value=NA")), class = "data.frame", row.names = c(NA, 
                                                                                  -12L))
                                                                                  

                                                                                  As a start I tried experimenting with this:

                                                                                  example %>% 
                                                                                    pivot_longer(cols = everything(),
                                                                                                 names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
                                                                                                 # names_sep = "(?=\\d)_(?=\\d)",
                                                                                                 names_to = c("es", "pwr_1200", "pwr_2400"),
                                                                                                 values_to = "value")
                                                                                  

                                                                                  But it did not work, so I tried doing it in two steps, which sort of works, but the "pairing" gets messed up:

                                                                                    example %>% 
                                                                                    # pivot_longer(cols = everything(),
                                                                                    #              names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
                                                                                    #              # names_sep = "(?=\\d)_(?=\\d)",
                                                                                    #              names_to = c("es", "pwr_1200", "pwr_2400"),
                                                                                    #              values_to = "value")
                                                                                    pivot_longer(cols = contains("pwr_"),
                                                                                                 # names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
                                                                                                 names_sep = "_(?=\\d)",
                                                                                                 names_to = c("pwr_type", "pwr_sample"), values_to = "value") %>%
                                                                                    pivot_longer(cols = contains("es_"),
                                                                                                 # names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
                                                                                                 # names_sep = "_(?=\\d)",
                                                                                                 names_to = "es_type", values_to = "es")
                                                                                  

                                                                                  I would appreciate any help!

                                                                                  ANSWER

                                                                                  Answered 2022-Feb-03 at 10:59

                                                                                  library(tidyverse)
                                                                                  
                                                                                  example %>% 
                                                                                    pivot_longer(cols = starts_with("es"), names_to = "type", names_prefix = "es_", values_to = "es") %>%
                                                                                    pivot_longer(cols = starts_with("pwr"), names_to = "pwr", names_prefix = "pwr_") %>% 
                                                                                    filter(substr(type, 1, 3) == substr(pwr, 1, 3)) %>% 
                                                                                    mutate(pwr = parse_number(pwr)) %>% 
                                                                                    arrange(pwr, es, type)
                                                                                  

                                                                                  output

                                                                                     type            es   pwr value
                                                                                   1 independence   0.1  1200 0.08 
                                                                                   2 issue_owner    0.1  1200 0.087
                                                                                   3 party          0.1  1200 0.081
                                                                                   4 independence   0.2  1200 0.273
                                                                                   5 issue_owner    0.2  1200 0.235
                                                                                   6 party          0.2  1200 0.157
                                                                                   7 independence   0.1  2400 0.163
                                                                                   8 issue_owner    0.1  2400 0.13 
                                                                                   9 party          0.1  2400 0.102
                                                                                  10 independence   0.2  2400 0.513
                                                                                  11 issue_owner    0.2  2400 0.406
                                                                                  12 party          0.2  2400 0.267
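Comparing only the first three characters of each name works for these three types, but it would pair up the wrong columns if two types ever shared a prefix. A minimal sketch of an exact match follows. It assumes the power columns are named like `pwr_party_1200`; the original `example` data is not shown here, so the `toy` frame and its column names are hypothetical.

```r
library(tidyr)
library(dplyr)

# Hypothetical stand-in for `example`; column names are an assumption.
toy <- data.frame(
  es_party              = c(0.1, 0.2),
  es_independence       = c(0.1, 0.2),
  pwr_party_1200        = c(0.081, 0.157),
  pwr_party_2400        = c(0.102, 0.267),
  pwr_independence_1200 = c(0.08, 0.273),
  pwr_independence_2400 = c(0.163, 0.513)
)

long <- toy %>%
  pivot_longer(starts_with("es_"), names_to = "type",
               names_prefix = "es_", values_to = "es") %>%
  # split each power column name into its type and its sample size
  pivot_longer(starts_with("pwr_"), names_to = c("pwr_type", "pwr"),
               names_pattern = "pwr_(.*)_(\\d+)$", values_to = "value") %>%
  # keep only rows where the effect-size type equals the power type
  filter(type == pwr_type) %>%
  select(-pwr_type) %>%
  mutate(pwr = as.numeric(pwr)) %>%
  arrange(pwr, es, type)
```

Because the type is matched in full, adding a column such as `pwr_partisan_1200` would not collide with `party`.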
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70969176

                                                                                  QUESTION

                                                                                  Simulating Random Draws From a "Hat"
                                                                                  Asked 2021-Dec-28 at 21:50

                                                                                  Suppose I have the following 10 variables (num_var_1, num_var_2, num_var_3, num_var_4, num_var_5, factor_var_1, factor_var_2, factor_var_3, factor_var_4, factor_var_5):

                                                                                  set.seed(123)
                                                                                  
                                                                                  num_var_1 <- rnorm(1000, 10, 1)
                                                                                  num_var_2 <- rnorm(1000, 10, 5)
                                                                                  num_var_3 <- rnorm(1000, 10, 10)
                                                                                  num_var_4 <- rnorm(1000, 10, 10)
                                                                                  num_var_5 <- rnorm(1000, 10, 10)
                                                                                  
                                                                                  factor_1 <- c("A","B", "C")
                                                                                  factor_2 <- c("AA","BB", "CC")
                                                                                  factor_3 <- c("AAA","BBB", "CCC", "DDD")
                                                                                  factor_4 <- c("AAAA","BBBB", "CCCC", "DDDD", "EEEE")
                                                                                  factor_5 <- c("AAAAA","BBBBB", "CCCCC", "DDDDD", "EEEEE", "FFFFFF")
                                                                                  
                                                                                  factor_var_1 <- as.factor(sample(factor_1, 1000, replace=TRUE, prob=c(0.3, 0.5, 0.2)))
                                                                                  factor_var_2 <-  as.factor(sample(factor_2, 1000, replace=TRUE, prob=c(0.5, 0.3, 0.2)))
                                                                                  factor_var_3 <-  as.factor(sample(factor_3, 1000, replace=TRUE, prob=c(0.5, 0.2, 0.2, 0.1)))
                                                                                  factor_var_4 <-  as.factor(sample(factor_4, 1000, replace=TRUE, prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))
                                                                                  factor_var_5 <-  as.factor(sample(factor_4, 1000, replace=TRUE, prob=c(0.3, 0.2, 0.1, 0.1, 0.1)))
                                                                                  
                                                                                  id = 1:1000
                                                                                  
                                                                                  my_data = data.frame(id,num_var_1, num_var_2, num_var_3, num_var_4, num_var_5, factor_var_1, factor_var_2, factor_var_3, factor_var_4, factor_var_5)
                                                                                  
                                                                                  
                                                                                  > head(my_data)
                                                                                    id num_var_1 num_var_2 num_var_3 num_var_4  num_var_5 factor_var_1 factor_var_2 factor_var_3 factor_var_4 factor_var_5
                                                                                  1  1  9.439524  5.021006  4.883963  8.496925  11.965498            B           AA          AAA         CCCC         AAAA
                                                                                  2  2  9.769823  4.800225 12.369379  6.722429  16.501132            B           AA          AAA         AAAA         AAAA
                                                                                  3  3 11.558708  9.910099  4.584108 -4.481653  16.710042            C           AA          BBB         AAAA         CCCC
                                                                                  4  4 10.070508  9.339124 22.192276  3.027154  -2.841578            B           CC          DDD         BBBB         AAAA
                                                                                  5  5 10.129288 -2.746714 11.741359 35.984902 -10.261096            B           AA          AAA         DDDD         DDDD
                                                                                  6  6 11.715065 15.202867  3.847317  9.625850  32.053261            B           AA          CCC         BBBB         EEEE
                                                                                  

                                                                                  My Question: I am interested in selecting a random number of variables from this data - and taking random subsets from these variables. (And then repeating this process many times). For example - I would like to record such a randomly generated list:

                                                                                  • Iteration 1: num_var_2 > 12, factor_var_1 = "A, C", factor_var_4 = "BBBB, DDDD, EEEE"

                                                                                  • Iteration 2: num_var_1 >0, num_var_3 <10, factor_var_2 = "AA, BB, CC", factor_var_3 = "AAA", factor_var_5 = "CCCCC, DDDDD"

                                                                                  • Iteration 3: num_var_2 <5, num_var_5 <10, factor_var_1 = "B", factor_var_3 = "AAA"

                                                                                  • Iteration 4 : factor_var_4 = "BBBB"

                                                                                  etc.

                                                                                  I can perform the above manually, but this would take a long time (e.g. 10 iterations). Is there a way to automate this process and in the end, just output this kind of list (10 rows × 2 columns) :

                                                                                  Iteration                                                                                                  Condition
                                                                                  1                                               num_var_2 > 12, factor_var_1 = A, C, factor_var_4 = BBBB, DDDD, EEEE
                                                                                  2            num_var_1 >0, num_var_3 <10, factor_var_2 = AA, BB, CC, factor_var_3 = AAA, factor_var_5 = CCCCC, DDDDD
                                                                                  3                                                  num_var_2 <5, num_var_5 <10, factor_var_1 = B, factor_var_3 = AAA
                                                                                  4                                                                                                factor_var_4 = BBBB
                                                                                  

                                                                                  Can someone please show me how to do this?

                                                                                  ANSWER

                                                                                  Answered 2021-Dec-26 at 10:11

                                                                                  You may define a function FUN(n) that creates a data set as shown in OP.

                                                                                  FUN <- function(n=1e3) {
                                                                                    num_var_1 <- rnorm(n, 10, 1)
                                                                                    num_var_2 <- rnorm(n, 10, 5)
                                                                                    num_var_3 <- rnorm(n, 10, 10)
                                                                                    num_var_4 <- rnorm(n, 10, 10)
                                                                                    num_var_5 <- rnorm(n, 10, 10)
                                                                                    factor_1 <- c("A", "B", "C")
                                                                                    factor_2 <- c("AA", "BB", "CC")
                                                                                    factor_3 <- c("AAA", "BBB", "CCC", "DDD")
                                                                                    factor_4 <- c("AAAA", "BBBB", "CCCC", "DDDD", "EEEE")
                                                                                    factor_5 <- c("AAAAA", "BBBBB", "CCCCC", "DDDDD", "EEEEE", "FFFFFF")
                                                                                    factor_var_1 <- as.factor(sample(factor_1, n, replace=TRUE, 
                                                                                                                     prob=c(0.3, 0.5, 0.2)))
                                                                                    factor_var_2 <- as.factor(sample(factor_2, n, replace=TRUE, 
                                                                                                                     prob=c(0.5, 0.3, 0.2)))
                                                                                    factor_var_3 <- as.factor(sample(factor_3, n, replace=TRUE, 
                                                                                                                     prob=c(0.5, 0.2, 0.2, 0.1)))
                                                                                    factor_var_4 <- as.factor(sample(factor_4, n, replace=TRUE, 
                                                                                                                     prob=c(0.5, 0.2, 0.1, 0.1, 0.1)))
                                                                                    factor_var_5 <- as.factor(sample(factor_5, n, replace=TRUE, 
                                                                                                                     prob=c(0.3, 0.2, 0.1, 0.1, 0.1, .2)))
                                                                                    id <- 1:n
                                                                                    return(data.frame(id, num_var_1, num_var_2, num_var_3, num_var_4, 
                                                                                                      num_var_5, factor_var_1, factor_var_2, factor_var_3,
                                                                                                      factor_var_4, factor_var_5))
                                                                                  }
                                                                                  

                                                                                  Next, define (appropriate) expressions as strings in a list evl.

                                                                                  evl <- list(
                                                                                    c('num_var_2 > 12', 'factor_var_1 %in% c("A", "C")', 
                                                                                      'factor_var_4 %in% c("BBBB", "DDDD", "EEEE")'),
                                                                                    c('num_var_1 > 0', 'num_var_3 < 10', 'factor_var_2 %in% c("AA", "BB", "CC")',
                                                                                      'factor_var_3 %in% "AAA"', 'factor_var_5 %in% c("CCCCC", "DDDDD")'),
                                                                                    c('num_var_2 < 5', 'num_var_5 < 10', 'factor_var_1 %in% "B"',
                                                                                      'factor_var_3 %in% "AAA"'),
                                                                                    c('factor_var_4 %in% "BBBB"')
                                                                                  )
                                                                                  

                                                                                  Finally, in Map define a function that subsets the data of one replication according to the respective expressions using eval(parse(text=)). Call set.seed() once, outside the function, so that each replication draws different data while the whole run stays reproducible.

                                                                                  set.seed(42)
                                                                                  result <- Map(\(x, y) x[with(x, eval(parse(text=paste(y, collapse=' & ')))), ],
                                                                                                replicate(length(evl), FUN(), simplify=FALSE),
                                                                                                evl)
                                                                                  

                                                                                  Note: R version 4.1.2 (2021-11-01)

                                                                                  Gives
                                                                                  str(result)
                                                                                  # List of 4
                                                                                  # $ :'data.frame':  59 obs. of  11 variables:
                                                                                  #   ..$ id          : int [1:59] 3 6 25 29 32 34 52 54 58 93 ...
                                                                                  # ..$ num_var_1   : num [1:59] 9.99 10.95 9.38 8.53 9.65 ...
                                                                                  # ..$ num_var_2   : num [1:59] 13.6 17.4 20.3 19.3 16.1 ...
                                                                                  # ..$ num_var_3   : num [1:59] 9.42 18.67 6.1 25.71 -2.73 ...
                                                                                  # ..$ num_var_4   : num [1:59] 6.29 9.22 3.68 16.27 15.77 ...
                                                                                  # ..$ num_var_5   : num [1:59] 13.37 18.86 4.89 24.18 26.11 ...
                                                                                  # ..$ factor_var_1: Factor w/ 3 levels "A","B","C": 3 1 3 1 3 3 1 3 1 1 ...
                                                                                  # ..$ factor_var_2: Factor w/ 3 levels "AA","BB","CC": 3 3 1 1 1 2 3 3 1 3 ...
                                                                                  # ..$ factor_var_3: Factor w/ 4 levels "AAA","BBB","CCC",..: 1 1 2 1 1 4 2 1 3 2 ...
                                                                                  # ..$ factor_var_4: Factor w/ 5 levels "AAAA","BBBB",..: 5 2 2 2 2 2 5 2 4 4 ...
                                                                                  # ..$ factor_var_5: Factor w/ 6 levels "AAAAA","BBBBB",..: 3 5 2 3 5 4 4 6 1 6 ...
                                                                                  # $ :'data.frame':  53 obs. of  11 variables:
                                                                                  #   ..$ id          : int [1:53] 2 14 28 36 49 59 75 103 134 137 ...
                                                                                  # ..$ num_var_1   : num [1:53] 9.67 11.61 11.22 10.14 10.5 ...
                                                                                  # ..$ num_var_2   : num [1:53] 10.89 7.12 2.38 13.28 10.88 ...
                                                                                  # ..$ num_var_3   : num [1:53] 5.87 7.33 2.88 -10.78 4.09 ...
                                                                                  # ..$ num_var_4   : num [1:53] 19.239 6.261 -0.158 14.586 -0.544 ...
                                                                                  # ..$ num_var_5   : num [1:53] -5.1 21.04 2.81 1.76 27.19 ...
                                                                                  # ..$ factor_var_1: Factor w/ 3 levels "A","B","C": 1 1 1 2 3 2 3 3 2 3 ...
                                                                                  # ..$ factor_var_2: Factor w/ 3 levels "AA","BB","CC": 2 2 2 3 3 3 3 2 1 1 ...
                                                                                  # ..$ factor_var_3: Factor w/ 4 levels "AAA","BBB","CCC",..: 1 1 1 1 1 1 1 1 1 1 ...
                                                                                  # ..$ factor_var_4: Factor w/ 5 levels "AAAA","BBBB",..: 1 5 5 1 4 4 4 4 1 4 ...
                                                                                  # ..$ factor_var_5: Factor w/ 6 levels "AAAAA","BBBBB",..: 3 4 4 3 3 4 4 4 4 3 ...
                                                                                  # $ :'data.frame':  20 obs. of  11 variables:
                                                                                  #   ..$ id          : int [1:20] 3 44 91 181 222 233 241 287 293 302 ...
                                                                                  # ..$ num_var_1   : num [1:20] 12 10.26 9.65 8.48 12.1 ...
                                                                                  # ..$ num_var_2   : num [1:20] 3.68 3.61 3.28 4.01 1.78 ...
                                                                                  # ..$ num_var_3   : num [1:20] 4.113 -3.481 17.654 0.496 5.457 ...
                                                                                  # ..$ num_var_4   : num [1:20] 9.25 19.79 17.15 -4.72 22.16 ...
                                                                                  # ..$ num_var_5   : num [1:20] 6 8.49 4.31 4.67 1.96 ...
                                                                                  # ..$ factor_var_1: Factor w/ 3 levels "A","B","C": 2 2 2 2 2 2 2 2 2 2 ...
                                                                                  # ..$ factor_var_2: Factor w/ 3 levels "AA","BB","CC": 2 1 3 1 1 1 1 3 2 1 ...
                                                                                  # ..$ factor_var_3: Factor w/ 4 levels "AAA","BBB","CCC",..: 1 1 1 1 1 1 1 1 1 1 ...
                                                                                  # ..$ factor_var_4: Factor w/ 5 levels "AAAA","BBBB",..: 3 1 1 1 1 1 1 1 1 1 ...
                                                                                  # ..$ factor_var_5: Factor w/ 6 levels "AAAAA","BBBBB",..: 3 5 5 1 1 1 2 6 1 2 ...
                                                                                  # $ :'data.frame':  205 obs. of  11 variables:
                                                                                  #   ..$ id          : int [1:205] 7 10 23 24 27 29 31 33 38 40 ...
                                                                                  # ..$ num_var_1   : num [1:205] 10.23 9.78 8.92 10.16 9.93 ...
                                                                                  # ..$ num_var_2   : num [1:205] 23.49 13.06 12.17 16.88 7.93 ...
                                                                                  # ..$ num_var_3   : num [1:205] 6.33 9.33 14.04 21.66 28.56 ...
                                                                                  # ..$ num_var_4   : num [1:205] 16.33 -1.805 0.509 21.2 15.158 ...
                                                                                  # ..$ num_var_5   : num [1:205] 8.48 -1.31 5.03 15.07 19.48 ...
                                                                                  # ..$ factor_var_1: Factor w/ 3 levels "A","B","C": 1 1 2 1 2 1 2 2 3 2 ...
                                                                                  # ..$ factor_var_2: Factor w/ 3 levels "AA","BB","CC": 3 1 1 2 1 1 1 2 1 3 ...
                                                                                  # ..$ factor_var_3: Factor w/ 4 levels "AAA","BBB","CCC",..: 1 2 3 1 3 4 3 1 3 2 ...
                                                                                  # ..$ factor_var_4: Factor w/ 5 levels "AAAA","BBBB",..: 2 2 2 2 2 2 2 2 2 2 ...
                                                                                  # ..$ factor_var_5: Factor w/ 6 levels "AAAAA","BBBBB",..: 3 5 2 6 6 2 6 1 2 2 ...
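The expressions in `evl` above are written by hand. If the conditions themselves should be drawn at random, as the question asks, something along these lines might work. This is only a sketch in base R: `random_condition()` is a hypothetical helper, the rule "one random cutoff per numeric variable, one random subset of levels per factor" is an assumption about what a random condition should look like, and `toy` is a small stand-in for the real data.

```r
# Hypothetical helper: build one random condition string from a data frame.
# Numeric columns get a random operator and cutoff; other columns get a
# random non-empty subset of their levels. Columns in `skip` are ignored.
random_condition <- function(data, skip = "id") {
  vars <- setdiff(names(data), skip)
  n_pick <- sample(length(vars), 1)                  # how many variables
  picked <- vars[sort(sample(length(vars), n_pick))] # which ones, in order
  parts <- vapply(picked, function(v) {
    x <- data[[v]]
    if (is.numeric(x)) {
      op <- sample(c(">", "<"), 1)
      cutoff <- round(runif(1, min(x), max(x)), 1)
      paste(v, op, cutoff)
    } else {
      lev <- levels(factor(x))
      keep <- lev[sort(sample(length(lev), sample(length(lev), 1)))]
      sprintf("%s %%in%% c(%s)", v, paste0('"', keep, '"', collapse = ", "))
    }
  }, character(1))
  paste(unname(parts), collapse = " & ")
}

set.seed(1)
toy <- data.frame(
  id = 1:200,
  num_var_1 = rnorm(200, 10, 1),
  num_var_2 = rnorm(200, 10, 5),
  factor_var_1 = factor(sample(c("A", "B", "C"), 200, replace = TRUE))
)

# the requested two-column list: iteration number and condition
conditions <- data.frame(
  Iteration = 1:10,
  Condition = replicate(10, random_condition(toy)),
  stringsAsFactors = FALSE
)

# each condition can be evaluated the same way as the entries of evl
subsets <- lapply(conditions$Condition,
                  function(cond) toy[with(toy, eval(parse(text = cond))), ])
```

The conditions are kept as strings so they can be both printed as the requested table and evaluated against the data.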
                                                                                  

                                                                                  Source https://stackoverflow.com/questions/70483731

                                                                                  QUESTION

                                                                                  Break Apart a String into Separate Columns R
                                                                                  Asked 2021-Dec-17 at 20:39

                                                                                  I am trying to tidy up some data that is all contained in 1 column called "game_info" as a string. This data contains college basketball upcoming game data, with the Date, Time, Team IDs, Team Names, etc. Ideally each one of those would be its own column. I have tried separating with a space delimiter, but that has not worked well since there are teams such as "Duke" with 1 part to their name, and teams with 2 to 3 parts to their name (Michigan State, South Dakota State, etc). There are also teams with "-" dashes in their name.

                                                                                  Here is my data:

                                                                                  df <- data.frame(list(
                                                                                    game_info = c(
                                                                                      "12/16 7:00 PM 751 Appalachian State 752 Duke",
                                                                                      "12/16 7:00 PM 753 Chicago State 754 Indiana-Purdue",
                                                                                      "12/16 8:00 PM 755 Texas-Arlington 756 Oral Roberts", 
                                                                                      "12/16 10:00 PM 757 Dartmouth 758 Stanford"
                                                                                      )
                                                                                    ))
                                                                                  

                                                                                  Desired output:

                                                                                  date  time     away_team_id  away_team_name     home_team_id home_team_name
                                                                                  12/16 7:00 PM    751         Appalachian State  752          Duke
                                                                                  12/16 7:00 PM    753         Chicago State      754          Indiana-Purdue
                                                                                  12/16 8:00 PM    755         Texas-Arlington    756          Oral Roberts
                                                                                  12/16 10:00 PM   757         Dartmouth          758          Stanford
                                                                                  


                                                                                  ANSWER

                                                                                  Answered 2021-Dec-16 at 15:25

                                                                                  Here's one with regex. Each part of the pattern is explained below the code.

                                                                                  regex <- "^(\\d{2}\\/\\d{2})\\s*(\\d{1,2}:\\d{2}\\s*(PM|AM))\\s*(\\d+)\\s*([^\\d.]+)(\\d+)\\s*([^\\d.]+)$"
                                                                                  
                                                                                  # c() makes game_info a single character column with one row per game
                                                                                  data <- data.frame(game_info = c(
                                                                                    "12/16 7:00 PM 751 Appalachian State 752 Duke",
                                                                                    "12/16 7:00 PM 753 Chicago State 754 Indiana-Purdue",
                                                                                    "12/16 8:00 PM 755 Texas-Arlington 756 Oral Roberts",
                                                                                    "12/16 10:00 AM 757 Dartmouth 758 Stanford"
                                                                                  ))
                                                                                  library(stringr)
                                                                                  
                                                                                  # one match matrix per string, stacked into a single matrix
                                                                                  out <- do.call(rbind, str_match_all(data$game_info, regex))
                                                                                  out <- as.data.frame(out)
                                                                                  # remove full string & AM/PM
                                                                                  out$V1 <- NULL
                                                                                  out$V4 <- NULL
                                                                                  names(out) <- c("date", "time", "away_team_id", "away_team_name",
                                                                                                  "home_team_id", "home_team_name")
                                                                                  # remove white space from end
                                                                                  out$away_team_name <- trimws(out$away_team_name)
                                                                                  out$home_team_name <- trimws(out$home_team_name)
                                                                                  out
                                                                                  

                                                                                  Explanation:

                                                                                  ^(\d{2}/\d{2}) - starts with 2 digits/2 digits like 12/16. ^ is a start anchor and () are used to say we want to capture this group for plucking out

                                                                                  \s* - 0 or more spaces between our first group and the next

                                                                                  (\d{1,2}:\d{2}\s*(PM|AM)) - want 1 or 2 digits : 2 digits, then possibly a space and PM or AM

                                                                                  \s*(\d+)\s* - spaces around any number of digits, the first id

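                                                                                  ([^\d.]+) - one or more characters that are neither digits nor periods (inside a character class the . is a literal period). This will fall down if there are ever numbers or periods in your team names. If so, find some examples and we can improve it. The white space after the name is captured too, so it is removed later with trimws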

                                                                                  (\d+)\s* - second id and spaces

                                                                                  ([^\d.]+)$ - finally the other team name and the end-of-string anchor
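If stringr is not available, or team names may contain periods, a variant along these lines could be tried. This is a sketch, not the answer's method: `regex2` and `split_game_info()` are hypothetical names, the pattern uses a non-capturing group for AM/PM and lazy `\D+?` groups for the names so that no columns need dropping or trimming afterwards, and it still assumes team names contain no digits.

```r
# Hypothetical variant of the pattern: (?:...) keeps AM/PM out of the
# captured groups, and \D+? matches a name lazily up to the next number.
regex2 <- paste0(
  "^(\\d{1,2}/\\d{1,2})\\s+",         # date
  "(\\d{1,2}:\\d{2}\\s*(?:AM|PM))\\s+", # time
  "(\\d+)\\s+(\\D+?)\\s+",            # away id, away name
  "(\\d+)\\s+(\\D+?)\\s*$"            # home id, home name
)

split_game_info <- function(x) {
  m <- regmatches(x, regexec(regex2, x, perl = TRUE))
  # a match has 7 elements (full string + 6 groups); pad non-matches with NA
  m <- lapply(m, function(r) {
    if (length(r) == 7) r[-1] else rep(NA_character_, 6)
  })
  out <- as.data.frame(do.call(rbind, m), stringsAsFactors = FALSE)
  names(out) <- c("date", "time", "away_team_id", "away_team_name",
                  "home_team_id", "home_team_name")
  out
}

res <- split_game_info(c(
  "12/16 7:00 PM 751 Appalachian State 752 Duke",
  "12/16 8:00 PM 755 Texas-Arlington 756 Oral Roberts",
  "12/16 10:00 AM 757 St. John's 758 Stanford",
  "not a game"
))
```

Strings that do not fit the pattern come back as a row of NA values instead of being silently dropped, which makes malformed input easier to spot.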

                                                                                  Source https://stackoverflow.com/questions/70381064

                                                                                  Community Discussions, Code Snippets contain sources that include Stack Exchange Network

                                                                                  Vulnerabilities

                                                                                  No vulnerabilities reported

                                                                                  Install BullshitGenerator

                                                                                  You can download it from GitHub.

                                                                                  Support

                                                                                  For new features, suggestions, and bugs, create an issue on GitHub. If you have questions, search for existing answers or ask on the community page on Stack Overflow.
                                                                                  Find, review, and download reusable Libraries, Code Snippets, Cloud APIs from over 650 million Knowledge Items
                                                                                  Find more libraries
                                                                                  Explore Kits - Develop, implement, customize Projects, Custom Functions and Applications with kandi kits
                                                                                  Save this library and start creating your kit
                                                                                  CLONE
                                                                                • HTTPS

                                                                                  https://github.com/menzi11/BullshitGenerator.git

                                                                                • CLI

                                                                                  gh repo clone menzi11/BullshitGenerator

                                                                                • SSH

                                                                                  git@github.com:menzi11/BullshitGenerator.git



                                                                                  Try Top Libraries by menzi11

                                                                                  utfdir

                                                                                  by menzi11 Python

                                                                                  nomake

                                                                                  by menzi11 Python

                                                                                  EZFile

                                                                                  by menzi11 Python

                                                                                  CleanZhiHu

                                                                                  by menzi11 JavaScript

                                                                                  Compare Data Manipulation Libraries with Highest Support

                                                                                  numpy

                                                                                  by numpy

                                                                                  hapi-fhir

                                                                                  by jamesagnew

                                                                                  nbconvert

                                                                                  by jupyter

                                                                                  protege

                                                                                  by protegeproject

                                                                                  database

                                                                                  by blazegraph
