Data Science

Explore all Data Science open source software, libraries, packages, source code, cloud functions and APIs.


Trending Discussions on Data Science

Pandas merge multiple dataframes on one temporal index, with latest value from all others

C# Using class specific member references that child "base" calls respect

Python script to repeatedly read a sensor errors

renv + venv + jupyterlab + IRkernel: will it blend?

What does runif() mean when used inside if_else()?

Create new boolean fields based on specific bigrams appearing in a tokenized pandas dataframe

Webpage starts zoomed out on mobile devices

Do random functions such as sample work when I deploy a Shiny App?

How can I check a confusion_matrix after fine-tuning with custom datasets?

How to rewrite this deprecated expression using do and "by", with "groupby" (Julia)

QUESTION

Pandas merge multiple dataframes on one temporal index, with latest value from all others

Asked 2022-Apr-16 at 03:35

I'm merging some dataframes which have a time index.

import pandas as pd

df1 = pd.DataFrame(['a', 'b', 'c'],
    columns=pd.MultiIndex.from_product([['target'], ['key']]),
    index=[
        '2022-04-15 20:20:20.000000',
        '2022-04-15 20:20:21.000000',
        '2022-04-15 20:20:22.000000'],)
df2 = pd.DataFrame(['a2', 'b2', 'c2', 'd2', 'e2'],
    columns=pd.MultiIndex.from_product([['feature2'], ['keys']]),
    index=[
        '2022-04-15 20:20:20.100000',
        '2022-04-15 20:20:20.500000',
        '2022-04-15 20:20:20.900000',
        '2022-04-15 20:20:21.000000',
        '2022-04-15 20:20:21.100000',],)
df3 = pd.DataFrame(['a3', 'b3', 'c3', 'd3', 'e3'],
    columns=pd.MultiIndex.from_product([['feature3'], ['keys']]),
    index=[
        '2022-04-15 20:20:19.000000',
        '2022-04-15 20:20:19.200000',
        '2022-04-15 20:20:20.000000',
        '2022-04-15 20:20:20.200000',
        '2022-04-15 20:20:23.100000',],)

then I use this merge procedure:

def merge(dfs: list[pd.DataFrame], targetColumn: 'str|tuple[str]'):
    from functools import reduce
    if len(dfs) == 0:
        return None
    if len(dfs) == 1:
        return dfs[0]
    for df in dfs:
        df.index = pd.to_datetime(df.index)
    # outer-join all frames on the time index, then forward-fill the features
    merged = reduce(
        lambda left, right: pd.merge(
            left,
            right,
            how='outer',
            left_index=True,
            right_index=True),
        dfs)
    for col in merged.columns:
        if col != targetColumn:
            merged[col] = merged[col].fillna(method='ffill')
    # keep only rows where the target was actually observed
    return merged[merged[targetColumn].notna()]

like this:

merged = merge([df1, df2, df3], targetColumn=('target', 'key'))

which produces this:

[image: the merged output dataframe]

And it all works great. The problem is efficiency: notice that in the merge() procedure I use reduce with an outer merge to join the dataframes together. This can create a huge interim dataframe which then gets filtered down. But what if my PC doesn't have enough RAM to hold that huge dataframe in memory? That's the problem I'm trying to avoid.

I'm wondering if there's a way to avoid expanding the data out into a huge dataframe while merging.

Of course a regular old merge isn't sufficient because it only merges on exactly matching indexes rather than the latest temporal index before the target variable's observation:

df1.merge(df2, how='left', left_index=True, right_index=True)

[image: result of the plain left merge, efficient but incorrect]

Has this kind of thing been solved efficiently? It seems like a common data science issue, since no one wants to leak future information into their models, and everyone has various inputs to merge together...

ANSWER

Answered 2022-Apr-16 at 00:45

You're in luck: pandas.merge_asof does exactly what you need!

We use the default direction='backward' argument:

A “backward” search selects the last row in the right DataFrame whose ‘on’ key is less than or equal to the left’s key.

Using your three example DataFrames:

import pandas as pd
from functools import reduce

# Convert all indexes to datetime
for df in [df1, df2, df3]:
    df.index = pd.to_datetime(df.index)

# Perform as-of merges
res = reduce(lambda left, right:
             pd.merge_asof(left, right, left_index=True, right_index=True),
             [df1, df2, df3])

print(res)

                    target feature2 feature3
                       key     keys     keys
2022-04-15 20:20:20      a      NaN       c3
2022-04-15 20:20:21      b       d2       d3
2022-04-15 20:20:22      c       e2       d3
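If matches should only count within a freshness window, pandas.merge_asof also accepts tolerance and direction arguments. A minimal sketch (not part of the original answer), reusing df1 and df2 from above:

# Only accept feature2 observations at most 500ms older than the target row;
# anything staler stays NaN instead of being carried backward from far away.
res2 = pd.merge_asof(
    df1, df2,
    left_index=True, right_index=True,
    direction='backward',               # default: last right key <= left key
    tolerance=pd.Timedelta('500ms'))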

Source https://stackoverflow.com/questions/71889742

QUESTION

C# Using class specific member references that child "base" calls respect

Asked 2022-Apr-01 at 10:38

I'm currently working on a codebase and struggling to find an optimal and clean solution. I've removed the context of the problem to help simplify it to its root components. The Scale property is a simplification of a more complex state of the class in the actual codebase. I have an idea (referenced at the bottom) of how I could solve this issue; however, the solution feels messy and just avoids the area I want to better understand.

Class Hierarchy

public class GreatGrandparent
{
    public virtual int Scale { get; set; } = 1;
    public virtual int GetTrueScale()
    {
        return Scale;
    }
}

public class Grandparent : GreatGrandparent
{
    public override int Scale { get; set; } = 2;
    public override int GetTrueScale()
    {
        return Scale * base.GetTrueScale();
    }
}

public class Parent : Grandparent
{
    public override int Scale { get; set; } = 8;
}

public class Child : Parent
{
    public override int Scale { get; set; } = 4;
}

Somewhere else in code:

public class Main
{
    public void Run()
    {
        Child aChild = new Child();
        int aChildTrueScale = aChild.GetTrueScale();
    }
}
33
  • Expected Result: 4 (4×1) (Refer to Edit 1)
  • Actual Result: 16 (4×4)
  • Desired Result: 64 (4×8×2×1)

I want a child to find its relative scale by taking in all factors of scale from its parents, like so:

child relative scale = child scale × parent scale × … × base class scale

How can I (if possible) define the GetTrueScale method once in the parent class to get the desired result, which all children inherit, to avoid continuously overriding the method with duplicate implementations (the exception being GreatGrandparent)?

"Messy" Solution

Define a separate property/field in each class, and continuously override the GetTrueScale() method with a return of ClassScale * base.GetTrueScale(), where ClassScale is a different property on each class.

Edit 1

The expected result was my initial expectation based on my understanding at the time: I thought that within a base call, the Scale reference would respect the change in scope and resolve to the base class's value. With some further testing, it appears that regardless of the scope in which a base method is called, the referenced Scale value is always the one from the initial object's scope (hence 4×4).

Is it possible to refer to properties based on their scope? So in a base.GetTrueScale() call, any references within that function would resolve in the base scope. Or am I completely missing something / trying to oversimplify inheritance?

Footnote

I've got a bit of experience with procedural programming around data science, but I'm fairly inexperienced with object-oriented programming, so forgive me if I'm ignorant of some core concepts. I'm happy to clarify anything; thanks for taking the time to look over my first question! ^-^

(If anyone can think of a better title please let me know and I'll fix it up - was struggling to define the issue simply)

ANSWER

Answered 2022-Apr-01 at 08:32

The methods in the type hierarchy are called in order, from the most base type to the most derived.

Because you do not override the method in Parent, its Scale is never multiplied in; that is the reason you got 16. It helps to debug and watch the order of execution of your code.

You can add an override of the GetTrueScale() method in the Parent class to get the desired value of 64. The whole code will look like this:

public class GreatGrandparent
{
    public virtual int Scale { get; set; } = 1;

    public virtual int GetTrueScale()
    {
        Console.WriteLine("GreatGrandparent: " + Scale);
        return Scale;
    }
}

public class Grandparent : GreatGrandparent
{
    public override int Scale { get; set; } = 2;

    public override int GetTrueScale()
    {
        Console.WriteLine("Grandparent: " + Scale);
        return Scale * base.GetTrueScale();
    }
}

public class Parent : Grandparent
{
    public override int Scale { get; set; } = 8;

    public override int GetTrueScale()
    {
        Console.WriteLine("Parent: " + Scale);
        return Scale * base.GetTrueScale();
    }
}

and Child class:

public class Child : Parent
{
    public override int Scale { get; set; } = 4;
}

Source https://stackoverflow.com/questions/71703497

QUESTION

Python script to repeatedly read a sensor errors

Asked 2022-Mar-04 at 07:42

Hello stackoverflow community!

I am writing a Python script to repeatedly retrieve a value from a PiHat sensor. I found this answer on Stack Overflow (Run a python function every second), but I run into the following error after it completes one loop: ValueError: sleep length must be non-negative.

Here is the code

from sense_hat import SenseHat
import time

start_time = time.time()
interval = 1
sense = SenseHat()
# humidity = sense.get_humidity() *move this to loop

for i in range(20):
    humidity = sense.get_humidity()
    sense.clear()
    print(humidity)
    time.sleep(start_time + i*interval - time.time())
    # f()

*I edited the above code to fetch sensor values inside the loop, in case anyone wants to run similar code; otherwise it would just display the same value repeatedly. I have kept everything else the same for reference, so you can see where I went wrong. I also commented out f().

No one seemed to have mentioned the issue on the thread where I found this. I have run data science projects with Python but not many traditional programs, so I apologize if the formatting is off.

Basically, could you explain why it does this and suggest some possible solutions to make it work? If you can explain why, it would really help my understanding of Python. The sample code I found is quite old, so it is possibly an outdated method. I am running this on a Raspberry Pi 4B, so is it possibly not executing fast enough?

I appreciate the help

ANSWER

Answered 2022-Mar-04 at 07:28

As mentioned in the comments, in the first iteration of the for-loop, i is 0. You should take into account that the calculated sleep value may become negative (hence the ValueError: sleep length must be non-negative).

You can use max(0, start_time + i*interval - time.time()) or change the range to range(1,21) to prevent this from happening.

As for the differences between PyCharm and the Raspberry Pi, it might have to do with the clock precision of the two machines (different values yielded by time.time()).
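A minimal sketch of the corrected loop (assuming the same SenseHat setup as in the question); clamping the sleep length with max() keeps the schedule anchored to start_time even when an iteration overruns:

from sense_hat import SenseHat
import time

start_time = time.time()
interval = 1
sense = SenseHat()

for i in range(1, 21):
    humidity = sense.get_humidity()
    sense.clear()
    print(humidity)
    # Never pass a negative value to time.sleep(): if this iteration ran
    # past its scheduled wake-up time, skip sleeping and catch up instead.
    time.sleep(max(0, start_time + i * interval - time.time()))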

Source https://stackoverflow.com/questions/71347318

QUESTION

renv + venv + jupyterlab + IRkernel: will it blend?

Asked 2022-Feb-24 at 20:06
Short version

What is a simple and elegant way to use renv, venv and jupyterlab with IRkernel together? In particular, how do I automatically activate renv from a jupyter notebook that is not in the root directory?

Long version

I'm embracing a "polyglot" data science style, which means using both Python and R in tandem. venv is awesome, renv is awesome, and jupyterlab is awesome, so I'm trying to figure out a neat way to use them all together.

I almost have it, so probably a few hints would be enough to finish this setup. Here's where I'm at.

System

Start with a clean OS and install the system-level requirements: R + renv and Python + venv. For example, on Ubuntu it would be approximately like this:

# R
sudo apt install r-base
sudo R -e "install.packages('renv')"

# Python
sudo apt install python3.8
sudo apt install python3.8-venv
Project

Now create a bare bones project jupyrenv with two files:

jupyrenv/
├── DESCRIPTION
└── requirements.txt

DESCRIPTION contains R dependencies:

Suggests:
    IRkernel,
    fortunes

requirements.txt contains python dependencies:

jupyterlab

Create virtual environments and install dependencies (order matters, R has to follow python):

# Python
python3.8 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# R
R -e "renv::init(bare=TRUE)"
R -e "renv::install()"
R -e "IRkernel::installspec()"

Very neat so far!

Jupyter

Launch jupyter from the command line and rejoice, it works!

jupyter-lab


What's not to like?

Unfortunately, if I create a folder (say notebooks) and launch an R notebook there, it does not work :(

[I 2022-02-23 19:07:24.628 ServerApp] Creating new directory in 
[I 2022-02-23 19:07:31.159 ServerApp] Creating new notebook in /notebooks
[I 2022-02-23 19:07:31.416 ServerApp] Kernel started: 0aa2c276-18dc-4511-b308-e78234fa71d4
Error in loadNamespace(name) : there is no package called ‘IRkernel’
Calls: :: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
Attempt to fix

It seems that renv is not activated from a subfolder, so we need to hint the R process to use it. I tried adding an extra .Rprofile file to the notebooks subfolder:

jupyrenv/
├── DESCRIPTION
├── requirements.txt
├── renv
├── venv
├── notebooks
│   ├── .Rprofile
│   └── Untitled.ipynb
├── .Rprofile
└── Untitled.ipynb

With the following contents:

.Rprofile:

source("../renv/activate.R")

And it kind of works, but not really. First, when trying to create an R notebook in the notebooks directory, it creates a new renv:

[I 2022-02-23 19:22:28.986 ServerApp] Creating new notebook in /notebooks
[I 2022-02-23 19:22:29.298 ServerApp] Kernel started: b40a88b3-b0bb-4839-af45-85811ec3073c
# Bootstrapping renv 0.15.2 --------------------------------------------------
* Downloading renv 0.15.2 ... OK (downloaded source)
* Installing renv 0.15.2 ... Done!
* Successfully installed and loaded renv 0.15.2.

That instance of jupyter then works, and I can use it, but if I restart, it stops working and goes back to the missing IRkernel error:

[I 2022-02-23 19:24:58.912 ServerApp] Kernel started: 822d9372-47fd-43f5-8ac7-77895ef124dc
Error in loadNamespace(name) : there is no package called ‘IRkernel’
Calls: :: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart

What am I missing?

ANSWER

Answered 2022-Feb-24 at 20:06

I opened this question as an issue in the renv GitHub repo, and the maintainers kindly provided a workaround. The contents of notebooks/.Rprofile should be as follows:

owd <- setwd(".."); source("renv/activate.R"); setwd(owd)

It blends! 🎉

Source https://stackoverflow.com/questions/71242328

QUESTION

What does runif() mean when used inside if_else()?

Asked 2022-Feb-17 at 00:53

Can you help me interpret this code? I am specifically confused about the three arguments inside if_else: runif(n()) < 0.1, NA_character_, and as.character(cut).

diamonds %>%
  mutate(cut = if_else(runif(n()) < 0.1, NA_character_, as.character(cut))) %>%
  ggplot() +
  geom_bar(mapping = aes(x = cut))

source: R for Data Science

ANSWER

Answered 2022-Feb-17 at 00:53

I'll assume you understand everything outside of the contents of the mutate call. As others have suggested in the comments, you can find documentation for any of these functions using the ?function syntax.

dplyr::mutate() is being used here to add a new column, "cut", to the diamonds dataframe, which will replace the old "cut" column:

cut = ifelse(runif(n()) < 0.1, NA_character_, as.character(cut))
ifelse()

ifelse is a function that requires three arguments: the first is a conditional ("test"), the second is the value to return if the conditional is true ("yes"), and the third is the value to return if the conditional is false ("no"). Its main advantage over a standard 'if' statement is that it can be vectorised. For example:

ifelse(test = c(1,2,3) < 3, yes = "less than three", no = "more than two")
# [1] "less than three" "less than three" "more than two"
runif()

stats::runif() is a function that generates random numbers between default values of 0 and 1. "runif" is short for "random uniform (number)". Its first argument, "n" is the number of numbers to generate. For example:

## set random seed for reproducible results
set.seed(1)
## generate 5 random numbers
runif(5)
# [1] 0.2655087 0.3721239 0.5728534 0.9082078 0.2016819
n()

dplyr::n() is a function that can only be used within calls to mutate(), summarise() and filter(). It returns the number of observations within the current group. Assuming that your data is ungrouped, this will be equivalent to nrow(diamonds).

NA_character_

It's not obvious, but there are different types of NA value within R. NA values are normally coerced to the correct type, but in some operations (presumably including this one) it is necessary to specify the type of NA that is required. NA_character_ just means a missing character value. Other, similar reserved names in R include NA_integer_ and NA_real_.

as.character(cut)

The "cut" data within the diamonds data frame is an ordered factor with five levels. The values of ordered factors are actually integers, each of which pertains to a string stored within the levels attribute of the factor. as.character is a generic function, which means it does slightly different things depending on its input. When the input of as.character is a factor, as.character returns the levels of the factor as a character vector. This sounds complicated, but in practise it's very intuitive:

my.factor <- factor(c("level 1", "level 2", "level 3", "level 2"))

## implicitly calling `print.factor`
my.factor
# [1] level 1 level 2 level 3 level 2
# Levels: level 1 level 2 level 3

## peeking under the hood
unclass(my.factor)
# [1] 1 2 3 2
# attr(,"levels")
# [1] "level 1" "level 2" "level 3"

## `as.character` returns the levels pertaining to each element
as.character(my.factor)
# [1] "level 1" "level 2" "level 3" "level 2"
Putting it all together

The call to ifelse achieves the following:

Generate a vector of random numbers between zero and one whose length is the number of rows in the diamonds dataframe. Then, for each random number: if it is less than 0.1, return a missing character value (NA_character_); otherwise, return the level name of the corresponding element of diamonds$cut.

The call to mutate simply overwrites the previous diamonds$cut (used in the calculation) with this new character vector.

Source https://stackoverflow.com/questions/71150807

QUESTION

Create new boolean fields based on specific bigrams appearing in a tokenized pandas dataframe

Asked 2022-Feb-16 at 20:47

Looping over a list of bigrams to search for, I need to create a boolean field for each bigram according to whether or not it is present in a tokenized pandas series. And I'd appreciate an upvote if you think this is a good question!

List of bigrams:

bigrams = ['data science', 'computer science', 'bachelors degree']

Dataframe:

df = pd.DataFrame(data={'job_description': [['data', 'science', 'degree', 'expert'],
                                            ['computer', 'science', 'degree', 'masters'],
                                            ['bachelors', 'degree', 'computer', 'vision'],
                                            ['data', 'processing', 'science']]})

Desired Output:

                         job_description  data science computer science bachelors degree
0        [data, science, degree, expert]          True            False            False
1   [computer, science, degree, masters]         False             True            False
2  [bachelors, degree, computer, vision]         False            False             True
3             [data, bachelors, science]         False            False            False

Criteria:

  1. Only exact matches should be replaced (for example, flagging for 'data science' should return True for 'data science' but False for 'science data' or 'data bachelors science')
  2. Each search term should get its own field and be concatenated to the original df

What I've tried:

Failed: df = [x for x in df['job_description'] if x in bigrams]

Failed: df[bigrams] = [[any(w==term for w in lst) for term in bigrams] for lst in df['job_description']]

Failed: Could not adapt the approach here -> Match trigrams, bigrams, and unigrams to a text; if unigram or bigram a substring of already matched trigram, pass; python

Failed: Could not get this one to adapt, either -> Compare two bigrams lists and return the matching bigram

Failed: This method is very close, but couldn't adapt it to bigrams -> Create new boolean fields based on specific terms appearing in a tokenized pandas dataframe

Thanks for any help you can provide!

ANSWER

Answered 2022-Feb-16 at 20:28

You could use a regex and extractall:

regex = '|'.join('(%s)' % b.replace(' ', r'\s+') for b in bigrams)
matches = (df['job_description'].apply(' '.join)
           .str.extractall(regex).droplevel(1).notna()
           .groupby(level=0).max()
           )
matches.columns = bigrams

out = df.join(matches).fillna(False)

output:

                         job_description  data science  computer science  bachelors degree
0        [data, science, degree, expert]          True             False             False
1   [computer, science, degree, masters]         False              True             False
2  [bachelors, degree, computer, vision]         False             False              True
3            [data, processing, science]         False             False             False

generated regex:

'(data\\s+science)|(computer\\s+science)|(bachelors\\s+degree)'
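As a side note (not from the original answer), a simpler word-boundary approach can work too; a minimal sketch, assuming the same bigrams and df as above:

import re

# Join each token list into one string, then test each bigram with word
# boundaries so 'data science' does not match inside e.g. 'bigdata science'.
joined = df['job_description'].str.join(' ')
for bg in bigrams:
    df[bg] = joined.str.contains(rf'\b{re.escape(bg)}\b')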

Source https://stackoverflow.com/questions/71147799

QUESTION

Webpage starts zoomed out on mobile devices

Asked 2022-Jan-15 at 20:33

I have created a website for desktop and mobile, and it has to be responsive. My problem is that when I resize the browser, all the content gets zoomed out instead of adapting. I also have an issue with the HTML: according to dev tools it only takes up 1/3 of the page. When I add width: 1100px to my sections, it renders the desktop version, but when I take it away everything floats to the left side. Why is this happening?

Images of the problem: [screenshots not included]

* {
    margin: 0;
    padding: 0;
    box-sizing: border-box;
}
body {
    font-family: 'Source Sans Pro', sans-serif;
    background-color: black;
    color: white;
    line-height: 30px;
}
html {
    width: 100%;
}
img {
    width: 100%;
}
h1 {
    font-weight: 700;
    font-size: 44px;
    margin-bottom: 40px;
    line-height: 50px;
}
h3 {
    width: 100%;
}
/* header */
header {
    display: flex;
    background-color: black;
    height: 80px;
    min-width: 1100px;
    justify-content: right;
    align-items: center;
    margin-bottom: 50px;
    border-bottom: 1px solid white;
}
nav ul li {
    display: inline-block;
    list-style-type: none;
    margin-right: 20px;
}
.nav-links {
    color: white;
    font-size: 18px;
}
/* Banner */
.banner {
    display: flex;
    justify-content: space-around;
    align-items: center;
    min-height: 500px;
    width: 100%;
}
.banner-text-container {
    max-width: 30%;
    font-size: 22px;
}
span {
    color: #11cc9e;
}
.consultation-link {
    color: #11cc9e;
    text-decoration: none;
    margin-top: 30px;
    font-weight: 900;
    display: block;
    border: 1px solid white;
    max-width: 40%;
    text-align: center;
    padding: 5px;
}
.consultation-link:hover {
    background-color: #fff;
}
/* About */
.about {
    display: flex;
    justify-content: space-around;
    align-items: center;
    min-height: 600px;
    min-width: 1100px;
}
.about-text-container {
    max-width: 40%;
    font-size: 22px;
    margin-left: 20px;
}
.about-img {
    width: 400px;
    margin-right: 22px;
}
.about-title {
    margin-bottom: 40px;
}
.about-us-link {
    color: #11cc9e;
    text-decoration: none;
    margin-top: 30px;
    font-weight: 900;
    display: block;
    border: 1px solid white;
    text-align: center;
    max-width: 25%;
    padding: 5px;
}
.about-us-link:hover {
    background-color: #fff;
}
/* Join */
.join {
    min-height: 600px;
    min-width: 1100px;
    max-width: 100%;
}
.join-header {
    width: 100%;
    text-align: center;
    margin-top: 150px;
    font-size: 40px;
}
.container-boxes {
    position: relative;
    top: 0;
    bottom: 0;
    display: flex;
    flex-wrap: wrap;
    justify-content: space-evenly;
    align-items: center;
    min-height: 500px;
    min-width: 1100px;
}
.box {
    position: relative;
    overflow: hidden;
    transition: 0.5s;
    height: 200px;
    width: 300px;
}
.box:hover {
    z-index: 1;
    transform: scale(1.25);
    box-shadow: 0 25px 40px rgba(0, 0, 0, .5);
    cursor: pointer;
}
.box .imgBX {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
}
.box .imgBX img {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    object-fit: cover;
}
.box .imgBX:before {
    content: '';
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    z-index: 1;
    background: linear-gradient(180deg, rgba(0, 0, 0, 0.7), #79dbc3);
    mix-blend-mode: multiply;
    opacity: 0;
    transition: 0.5s;
}
.box:hover .imgBX:before {
    opacity: 1;
}
.box .imgBX img {
    position: absolute;
    top: 0;
    left: 0;
    width: 100%;
    height: 100%;
    object-fit: cover;
}
.content {
    display: flex;
    flex-direction: column;
    text-align: center;
    position: absolute;
    top: 20%;
    bottom: 40%;
    width: 100%;
    height: 100%;
    z-index: 1;
    padding: 20px;
    visibility: hidden;
}
.box:hover .content {
    visibility: visible;
}
/* Quote section */
.quote-section {
    display: flex;
    justify-content: center;
    max-width: 100%;
    min-height: 500px;
    min-width: 1100px;
}
.quote-container {
    display: flex;
    flex-direction: column;
    flex-wrap: wrap;
    align-items: center;
    justify-items: center;
    max-width: 50%;
    font-size: 22px;
    text-align: center;
}
.quote {
    line-height: 90px;
    font-size: 150px;
    font-style: italic;
    color: #11cc9e;
    text-indent: -37px;
    font-weight: 600;
    width: 37px;
}
.quote-img {
    width: 90px;
    margin: 40px auto;
}
.person-name {
    color: #ccc;
}
.person-role {
    font-size: 17px;
    color: #ccc;
}
/* Footer */
footer {
    text-align: center;
    margin-top: 100px;
    padding-top: 50px;
    max-width: 100%;
    min-height: 200px;
    min-width: 1100px;
    border-top: 1px solid #fff;
}
And the HTML:

<!DOCTYPE html>
<html lang="en">
   <head>
      <title>Codes</title>
      <link rel="preconnect" href="https://fonts.googleapis.com">
      <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
      <link href="https://fonts.googleapis.com/css2?family=Source+Sans+Pro:wght@400;600&display=swap" rel="stylesheet">
      <meta charset="UTF-8">
      <meta http-equiv="X-UA-Compatible" content="IE=edge">
      <meta name="viewport" content="width=device-width, initial-scale=1">
      <link rel="stylesheet" href="./Resources/styles.css">
   </head>
   <body>
      <header>
         <!-- insert logo -->
         <nav class="nav-links">
            <ul>
               <li>About</li>
               <li>Peer group</li>
               <li>Review</li>
            </ul>
         </nav>
      </header>
      <section class="banner">
         <div class="banner-text-container">
            <h1>Build. Grow. <span class="color-Learn">Learn.</span></h1>
            <p>Unlock your potential with your peers! Using Blockchain, Fintech or the IT outsourcing company Boosty Labs helps you create an innovative end to end product or augment your team with the right experts.</p>
            <a class="consultation-link" href="#">Free consultation </a>
         </div>
         <div class="banner-img">
            <img src="./Resources/Images/banner.png" alt="">
         </div>
      </section>
      <section class="about">
         <div class="about-text-container">
            <h2 class="about-title">Who we are</h2>
            <p>Here you can find our collection of coding, data science and statistics tutorials with examples in R, Python, JavaScript and Python. As you click through, you'll notice that some tutorials have ribbons on their logos - they are part of our free and self-paced online course Data Science for Ecologists and Environmental Scientists! Yellow for the Stats from Scratch stream, blue for Wiz of Data Viz and purple for Mastering Modelling.</p>
            <a class="about-us-link" href="#">More about us </a>
         </div>
         <div class="about-img">
            <img src="./Resources/Images/whoweare.png" alt="">
         </div>
      </section>
      <section class="join">
         <h3 class="join-header">Join a peer group!</h3>
         <div class="container-boxes">
            <div class="box">
               <div class="imgBX">
                  <img src="./Resources/Images/box-1.png" alt="">
               </div>
               <div class="content">
                  <h3>AI</h3>
                  <p>Discover The Complete Range Of Artificial Intelligence Solutions.</p>
               </div>
            </div>
            <div class="box">
               <div class="imgBX">
                  <img src="./Resources/Images/box-2.png" alt="">
               </div>
               <div class="content">
                  <h3 class="frontend-title">Frontend Dev</h3>
                  <p>Discover The Complete Range Of Frontend Solutions.</p>
               </div>
            </div>
            <div class="box">
               <div class="imgBX">
                  <img src="./Resources/Images/box-3.png" alt="">
               </div>
               <div class="content">
                  <h3>Microsoft systems</h3>
                  <p>Discover The Complete Range Of Microsoft Solutions.</p>
               </div>
            </div>
         </div>
      </section>
      <section class="quote-section">
         <div class="quote-container">
            <div class="quote">"</div>
            <p class="p-quote">In coded, the progress of the topics and the exercises are really good. It's so nice to practice on good story told tasks. Also if you are stuck, it is nice to have a broad range of coders around in the peer groups that you can get the answers you are looking for.</p>
281      &lt;section class="about"&gt;
282         &lt;div class="about-text-container"&gt;
283            &lt;h2 class="about-title"&gt;Who we are&lt;/h2&gt;
284            &lt;p&gt;Here you can find our ,collection of coding, data science and statistics tutorials with examples in R, Python, JavaScript and Python. As you click through, you'll notice that some tutorials have ribbons on their logos - they are part of our free and self-paced online course Data Science for Ecologists and Environmental Scientists! Yellow for the Stats from Scratch stream, blue for Wiz of Data Viz and purple for Mastering Modelling.&lt;/p&gt;
285            &lt;a class="about-us-link" href="#"&gt;More about us &lt;/a&gt;
286         &lt;/div&gt;
287         &lt;div class="about-img"&gt;
288            &lt;img src="./Resources/Images/whoweare.png" alt=""&gt;
289         &lt;/div&gt;
290      &lt;/section&gt;
291      &lt;section class="join"&gt;
292         &lt;h3 class="join-header" &gt;Join a peer group!&lt;/h3&gt;
293         &lt;div class="container-boxes"&gt;
294            &lt;div class="box"&gt;
295               &lt;div class="imgBX"&gt; 
296                  &lt;img src="./Resources/Images/box-1.png" alt=""&gt;
297               &lt;/div&gt;
298               &lt;div class="content"&gt;
299                  &lt;h3&gt;AI&lt;/h3&gt;
300                  &lt;P&gt;Discover The Complete Range Of Artificial Intelligence Solutions.&lt;/P&gt;
301               &lt;/div&gt;
302            &lt;/div&gt;
303            &lt;div class="box"&gt;
304               &lt;div class="imgBX"&gt;
305                  &lt;img src="./Resources/Images/box-2.png" alt=""&gt;
306               &lt;/div&gt;
307               &lt;div class="content"&gt;
308                  &lt;h3 class="frontend-title"&gt;Frontend Dev&lt;/h3&gt;
309                  &lt;p&gt;Discover The Complete Range Of Frontend Solutions.&lt;/p&gt;
310               &lt;/div&gt;
311            &lt;/div&gt;
312            &lt;div class="box"&gt;
313               &lt;div class="imgBX"&gt;
314                  &lt;img src="./Resources/Images/box-3.png" alt=""&gt;
315               &lt;/div&gt;
316               &lt;div class="content"&gt;
317                  &lt;h3&gt;Microsoft systems&lt;/h3&gt;
318                  &lt;p&gt;Discover The Complete Range Of Microsoft Solutions.&lt;/p&gt;
319               &lt;/div&gt;
320            &lt;/div&gt;
321         &lt;/div&gt;
322      &lt;/section&gt;
323      &lt;section class="quote-section"&gt;
324         &lt;div class="quote-container"&gt;
325            &lt;div class="quote"&gt;"&lt;/div&gt;
326            &lt;p class="p-quote"&gt;In coded, the progress of the topics and the exercises are really good. It's so nice to practice on good story told tasks. Also if you are stuck, it is nice to have a broad range of coders around in the peer groups  that you can get the answers you are looking for.&lt;/p&gt;
327            &lt;div class="quote-img"&gt;
328               &lt;img src="./Resources/Images/person-img.png" alt=""&gt;
329            &lt;/div&gt;
330            &lt;div class="person-name"&gt;Peter Gangland &lt;/div&gt;
331            &lt;div class="person-role"&gt;Director of business dev at &lt;span&gt;Microsoft&lt;/span&gt;&lt;/div&gt;
332         &lt;/div&gt;
333      &lt;/section&gt;
334      &lt;footer&gt;
335         &lt;div id="contact"&gt;
336            &lt;h2&gt;
337            Contact us&lt;/h5&gt;
338            &lt;h5&gt;coded@peers.com&lt;/h5&gt;
339            &lt;h5&gt;831-867-5309&lt;/h5&gt;
340         &lt;/div&gt;
341         &lt;div id="copyright"&gt;
342            &lt;h5&gt;@copyright coded Enterprises 2022&lt;/h5&gt;
343         &lt;/div&gt;
344      &lt;/footer&gt;
345   &lt;/body&gt;
346&lt;/html&gt;

ANSWER

Answered 2022-Jan-15 at 19:43

To make your website responsive you need to use media queries: they let you tell the browser how to style the page at different viewport sizes. Making the site responsive should also resolve the problem with your sections.
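
For instance, a minimal sketch (the selectors come from the markup above; the 768px breakpoint is just an assumed example, not part of the original answer):

/* Below 768px, let the fixed-width sections shrink to the viewport */
@media (max-width: 768px) {
    .container-boxes,
    .quote-section,
    footer {
        min-width: 0;
    }
    /* Stack the boxes vertically instead of scaling them on hover */
    .container-boxes {
        flex-direction: column;
    }
}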

Source https://stackoverflow.com/questions/70724640

QUESTION

Do random functions such as sample work when I deploy a Shiny App?

Asked 2021-Dec-04 at 16:46

I have created a survey in Shiny and I would like the options in this survey to be different every time. To do so, I have used the sample function, and each time I run the app on my local machine the possible answer options are different, as I want. However, I have recently deployed the app through shinyapps.io, and now there seems to be no randomness in the options. This is the code of my Shiny app:

# Loading the needed libraries
library(shiny)
library(shinythemes)
library(googlesheets4)
library(googledrive)
library(shinyalert)

setwd('C:/Users/alber/Desktop/UniTn/Data Science/Third Semester/Laboraotry of Business and Customer analytics/Project_Real')

#gs4_auth(cache = ".secrets") # for the first time
gs4_auth(cache = ".secrets", email = TRUE, use_oob = TRUE) # when you deploy
sheet_id <- "1-l3D2dhWjwv1hWXs97db08pJUKZ3DF1DZ4d4yWAVsik"
#sheet_id <- "1MdqGpii3hfoG1OcvlAQjbQ171UOwxCR3Qfc8aIKfZIo"

# Let's define the demographic variables that will constitute the first part
# of our survey. These infos could then be used for market segmentation

platform_type <- c('Web App', 'Desktop App', 'Mobile App')
deposit_minmax <- c('min 0€ max 1000€', 'min 10€ max 10000€', 'min 100€ max infinite')
fees_on_purchases <- c('0%', '0.015%', '0.025%')
# https://www.investopedia.com/terms/f/financialinstrument.asp
financial_instruments <- c('Stocks', 'Crypto', 'ETFs', 'Commodities')
leverage <- c('YES', 'NO')
social_copy <- c('YES', 'NO')
n_a <- 5
# Now that we have defined the attributes and their levels we can implement a function
# that creates random profiles
create_options <- function(){

  list_prod <- c()

  for(i in 1:1000){
    # initialize the product profile
    prod_prof <- c(
      paste('Platform Type:', sample(platform_type,1), '|',
            'Amount of Deposit:', sample(deposit_minmax,1), '|',
            'Fees on buy & sell orders:', sample(fees_on_purchases,1), '|',
            'Financial Instruments:', sample(financial_instruments,1), '|',
            'Leverage:', sample(leverage,1), '|',
            'Social/Copy Trading', sample(social_copy,1))
    )
    # in order to avoid clones
    if (is.element(prod_prof, list_prod) == FALSE){
      list_prod <- append(prod_prof, list_prod)
    }
  }
  return(list_prod)
}

################################################################################

# START DEVELOPING THE APP

# User Interface
ui <- fluidPage(
  # Theme
  theme = shinytheme("cerulean"),
  # Creating a navigation bar
  navbarPage(h1('Trading App Survey'),
    tabPanel(
      h3('Survey'),

      # 1st Question
      checkboxGroupInput('Choice1', 'Which product do you prefer ? \n(Please pick ONLY ONE)', sample(create_options(), 3, replace = F)),

      #downloadButton('Results', label = 'Conclude the survey'),
      useShinyalert(),

      actionButton("submit", "Submit")
    ),
    tabPanel(h3('Appendix'),

             h2('Glossary')
    )))

# Define server function
server <- function(input, output) {

  observeEvent(input$submit, {

    results_s <- data.frame(input$Choice1)

    sheet_append(data = results_s, ss = sheet_id, sheet = 'Survey_Answers')

    shinyalert("Thank you!", "Your answers have been collected. You can close the survey", type = "success")
  })

}

# Create Shiny object
shinyApp(ui = ui, server = server)

How can I make it work when I deploy the app, too? Thank you in advance!

ANSWER

Answered 2021-Dec-04 at 16:46

This can be fixed by moving the random calculation inside the server function. If you call a random function outside the server, the app will still work, but the result will be the same for all users. This behaviour exists to avoid repeating large calculations unnecessarily when, for example, all users should see the same data.

Below is the code for the app I deployed on shinyapps.io. Because I need to execute create_options() inside the server, I use renderUI(). If sample() is called anywhere in the UI, it is only executed once, hence the static options.

Also, I used prettyRadioButtons from shinyWidgets to prevent users from picking more than one option.
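
As a minimal sketch of the pattern (the widget and output names here are made up for illustration, not taken from the original app):

library(shiny)

ui <- fluidPage(
  # Placeholder that the server fills in per session
  uiOutput("random_question")
)

server <- function(input, output, session) {
  # Because sample() runs inside the server, every new session
  # gets a fresh draw instead of the single draw made at startup
  output$random_question <- renderUI(
    radioButtons("choice", "Pick one:", sample(LETTERS, 3))
  )
}

shinyApp(ui, server)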

code:

library(shiny)
library(tidyverse)
library(shinythemes)
# library(googlesheets4)
# library(googledrive)
library(shinyalert)
library(shinyWidgets)

platform_type <- c("Web App", "Desktop App", "Mobile App")
deposit_minmax <- c("min 0€ max 1000€", "min 10€ max 10000€", "min 100€ max infinite")
fees_on_purchases <- c("0%", "0.015%", "0.025%")
# https://www.investopedia.com/terms/f/financialinstrument.asp
financial_instruments <- c("Stocks", "Crypto", "ETFs", "Commodities")
leverage <- c("YES", "NO")
social_copy <- c("YES", "NO")
n_a <- 5
# Now that we have defined the attributes and their levels we can implement a function
# that creates random profiles
create_options <- function() {
  list_prod <- c()

  for (i in 1:1000) {
    # initialize the product profile
    prod_prof <- c(
      paste(
        "Platform Type:", sample(platform_type, 1), "|",
        "Amount of Deposit:", sample(deposit_minmax, 1), "|",
        "Fees on buy & sell orders:", sample(fees_on_purchases, 1), "|",
        "Financial Instruments:", sample(financial_instruments, 1), "|",
        "Leverage:", sample(leverage, 1), "|",
        "Social/Copy Trading", sample(social_copy, 1)
      )
    )
    # in order to avoid clones
    if (is.element(prod_prof, list_prod) == FALSE) {
      list_prod <- append(prod_prof, list_prod)
    }
  }
  return(list_prod)
}

# APP ---------------------------------------------------------------------

ui <- fluidPage(
  # Theme
  theme = shinytheme("cerulean"),
  # Creating a navigation bar
  navbarPage(
    h1("Trading App Survey"),
    tabPanel(
      h3("Survey"),

      # 1st Question
      uiOutput("random_choices"),

      # downloadButton('Results', label = 'Conclude the survey'),
      useShinyalert(),
      actionButton("submit", "Submit")
    )
  ),
  tabPanel(
    h3("Appendix"),
    h2("Glossary")
  )
)

server <- function(input, output, session) {
  output$random_choices <- renderUI(prettyRadioButtons("Choice1",
    "Which product do you prefer ? \n(Please pick ONLY ONE)",
    sample(create_options(), 3, replace = F),
    icon = icon("check")
  ))

  rv <- reactiveValues(df = data.frame(question = NA, answer = NA))

  observeEvent(input$submit, {
    rv$df <- add_row(rv$df, question = "Choice1", answer = input$Choice1)

    # sheet_append(data = results_s, ss = sheet_id, sheet = 'Survey_Answers')

    shinyalert("Thank you!", "Your answers have been collected. You can close the survey", type = "success")
  })
}

# Create Shiny object
shinyApp(ui = ui, server = server)


Source https://stackoverflow.com/questions/70225150

QUESTION

How can I check a confusion_matrix after fine-tuning with custom datasets?

Asked 2021-Nov-24 at 13:26

This question is the same as How can I check a confusion_matrix after fine-tuning with custom datasets? on Data Science Stack Exchange.

Background

I would like to check a confusion_matrix, including precision, recall, and f1-score, like the one below, after fine-tuning with custom datasets.

The fine-tuning task is Sequence Classification with IMDb Reviews, following the Fine-tuning with custom datasets tutorial on Hugging Face.

After finishing fine-tuning with Trainer, how can I check a confusion_matrix in this case?

An example classification report, including precision, recall, and f1-score, from the original site (just for illustration):

predictions = np.argmax(trainer.test(test_x), axis=1)

# Confusion matrix and classification report.
print(classification_report(test_y, predictions))

            precision    recall  f1-score   support

          0       0.75      0.79      0.77      1000
          1       0.81      0.87      0.84      1000
          2       0.63      0.61      0.62      1000
          3       0.55      0.47      0.50      1000
          4       0.66      0.66      0.66      1000
          5       0.62      0.64      0.63      1000
          6       0.74      0.83      0.78      1000
          7       0.80      0.74      0.77      1000
          8       0.85      0.81      0.83      1000
          9       0.79      0.80      0.80      1000

avg / total       0.72      0.72      0.72     10000
Code
from transformers import DistilBertForSequenceClassification, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,
)

model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=val_dataset             # evaluation dataset
)

trainer.train()

What I did so far

I prepared the dataset for Sequence Classification with IMDb Reviews, and I'm fine-tuning with Trainer:

from pathlib import Path

def read_imdb_split(split_dir):
    split_dir = Path(split_dir)
    texts = []
    labels = []
    for label_dir in ["pos", "neg"]:
        for text_file in (split_dir/label_dir).iterdir():
            texts.append(text_file.read_text())
            labels.append(0 if label_dir == "neg" else 1)

    return texts, labels

train_texts, train_labels = read_imdb_split('aclImdb/train')
test_texts, test_labels = read_imdb_split('aclImdb/test')

from sklearn.model_selection import train_test_split
train_texts, val_texts, train_labels, val_labels = train_test_split(train_texts, train_labels, test_size=.2)

from transformers import DistilBertTokenizerFast
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

train_encodings = tokenizer(train_texts, truncation=True, padding=True)
val_encodings = tokenizer(val_texts, truncation=True, padding=True)
test_encodings = tokenizer(test_texts, truncation=True, padding=True)

import torch

class IMDbDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_dataset = IMDbDataset(train_encodings, train_labels)
val_dataset = IMDbDataset(val_encodings, val_labels)
test_dataset = IMDbDataset(test_encodings, test_labels)

ANSWER

Answered 2021-Nov-24 at 13:26

What you could do in this situation is iterate over the validation set (or over the test set, for that matter) and manually build lists of y_true and y_pred:

import torch
import torch.nn.functional as F
from sklearn import metrics

y_preds = []
y_trues = []
for index, val_text in enumerate(val_texts):
    tokenized_val_text = tokenizer([val_text],
                                   truncation=True,
                                   padding=True,
                                   return_tensors='pt')
    with torch.no_grad():
        logits = model(**tokenized_val_text).logits
    prediction = F.softmax(logits, dim=1)
    y_pred = torch.argmax(prediction).item()
    y_true = val_labels[index]
    y_preds.append(y_pred)
    y_trues.append(y_true)

Finally,

# y_trues and y_preds hold integer class indices (0 = neg, 1 = pos)
confusion_matrix = metrics.confusion_matrix(y_trues, y_preds, labels=[0, 1])
print(confusion_matrix)

Observations:

  1. The outputs of the model are logits, not normalized probabilities.
  2. As such, we apply softmax along dimension one to turn them into actual probabilities (e.g. 0.2 for class 0, 0.8 for class 1).
  3. We apply the .argmax() operation to get the index of the predicted class; a toy illustration follows below.
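
As a toy illustration of those three steps (the logit values here are made up):

import torch
import torch.nn.functional as F

logits = torch.tensor([[-1.39, 0.0]])      # raw model outputs for one example
probs = F.softmax(logits, dim=1)           # roughly tensor([[0.2, 0.8]])
pred = torch.argmax(probs, dim=1).item()   # 1, the index of the most likely class
print(probs, pred)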

Source https://stackoverflow.com/questions/68691450

QUESTION

How to rewrite this deprecated expression using do and "by", with "groupby" (Julia)

Asked 2021-Oct-31 at 20:24

The goal is to generate fake data.

We generate a set of parameters,

## Simulated data
df_3 = DataFrame(y = [0,1], size = [250,250], x1 = [2.,0.], x2 = [-1.,-2.])

Now, I want to generate the fake data per se,

df_knn = by(df_3, :y) do df
  DataFrame(x_1 = rand(Normal(df[1,:x1],1), df[1,:size]),
            x_2 = rand(Normal(df[1,:x2],1), df[1,:size]))
end

How can I replace by with groupby here?

SOURCE: This excerpt is from the book, Data Science with Julia (2019).

ANSWER

Answered 2021-Oct-31 at 20:24

I think this is what you mean here:

julia> combine(groupby(df_3, :y)) do df
         DataFrame(x_1 = rand(Normal(df[1,:x1],1), df[1,:size]),
                   x_2 = rand(Normal(df[1,:x2],1), df[1,:size]))
       end
500×3 DataFrame
 Row │ y      x_1        x_2
     │ Int64  Float64    Float64
─────┼─────────────────────────────
   1 │     0   1.88483    0.890807
   2 │     0   2.50124   -0.280708
   3 │     0   1.1857     0.823002
  ⋮  │   ⋮        ⋮          ⋮
 498 │     1  -0.611168  -0.856527
 499 │     1   0.491412  -3.09562
 500 │     1   0.242016  -1.42652
                   494 rows omitted
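
More generally, the pattern by(df, cols) do sdf ... end from older DataFrames.jl releases becomes combine(groupby(df, cols)) do sdf ... end in recent versions, where by is no longer available. A minimal sketch with made-up data:

using DataFrames

df = DataFrame(g = [1, 1, 2], x = [1.0, 2.0, 3.0])

# by(df, :g) do sdf ... end   (deprecated) is now written as:
combine(groupby(df, :g)) do sdf
    DataFrame(mean_x = sum(sdf.x) / nrow(sdf))  # one row per group
end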

Source https://stackoverflow.com/questions/69790147

Community Discussions contain sources that include Stack Exchange Network

Tutorials and Learning Resources in Data Science

Tutorials and Learning Resources are not available at this moment for Data Science
