TweetSets | Service for creating Twitter datasets for research | Dataset library
kandi X-RAY | TweetSets Summary
Twitter datasets for research and archiving. TweetSets allows users to (1) select from existing datasets; (2) limit the dataset by querying on keywords, hashtags, and other parameters; (3) generate and download dataset derivatives such as the list of tweet ids and mention nodes/edges.
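As a rough illustration of what such a derivative contains (not TweetSets' actual implementation, which builds derivatives with Spark and Elasticsearch), a minimal Python sketch that pulls tweet ids and mention edges out of a file of line-delimited tweet JSON might look like this; the file path and function name are hypothetical:

    import json

    def derive_tweet_ids_and_mentions(path):
        # Collect the tweet-id list and (tweeting user -> mentioned user)
        # edges from one line-delimited tweet JSON file.
        tweet_ids = []
        mention_edges = []
        with open(path) as f:
            for line in f:
                tweet = json.loads(line)
                tweet_ids.append(tweet["id_str"])
                for mention in tweet.get("entities", {}).get("user_mentions", []):
                    mention_edges.append(
                        (tweet["user"]["screen_name"], mention["screen_name"])
                    )
        return tweet_ids, mention_edges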
Top functions reviewed by kandi - BETA
- Render a dataset
- Add a derivative to the database
- Get a connection to the database
- Add filenames
- Displays a limited dataset
- Add a new dataset
- Add a source dataset
- Write out all tweets
- Update current state
- Displays a list of datasets
- Extract mentions from a table
- Get the state of the tweet index
- Extract mentions from a dataframe (see the sketch after this list)
- Compute the partition for a given number of tweets
- Create Celery task
- Make a Spark DataFrame from a JSON file
- Fetch tweets by screen name
- Extract the number of tweets from a DataFrame
- Create dataset parameters
- Finds files in path
- Fetch tweets by a mention screen name
- Write all the mentions to disk
- Update the mentions table
- Extract columns from a DataFrame
- Concatenate json files into a single file
- Show statistics about datasets
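Several of the functions listed above relate to extracting mentions with Spark. A hedged sketch of the general technique in PySpark follows; the column names follow the standard Twitter JSON layout and are assumptions, not TweetSets' actual schema:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, explode

    spark = SparkSession.builder.appName("mention-edges").getOrCreate()

    # Read line-delimited tweet JSON into a DataFrame.
    tweets = spark.read.json("tweets.jsonl")

    # Build (source screen name, mentioned screen name) edges by exploding
    # the user_mentions array carried in each tweet's entities.
    mention_edges = tweets.select(
        col("user.screen_name").alias("source"),
        explode(col("entities.user_mentions.screen_name")).alias("target"),
    )

    mention_edges.write.csv("mention_edges", header=True)

TweetSets' own loader does this at scale across many files and also produces node/edge derivatives for download, so treat this only as an orientation to the approach.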
TweetSets Key Features
TweetSets Examples and Code Snippets
Community Discussions
Trending Discussions on TweetSets
QUESTION
ANSWER
Answered 2017-Sep-03 at 21:08
:load copies the contents of a file into the REPL line by line. That means that you end up trying to define a package (which is not allowed in the REPL), and then you try to import things that aren't visible, etc. If you use :load on a file that has a format usable by the REPL, it will work. In most cases, this means replacing the package line(s) with imports.
There's no need to use :load anyway. sbt console will place you in a REPL that has the project on its classpath. sbt consoleQuick will place you in a REPL that only has the dependencies on the classpath.
For your second question, you are meant to use sbt as a background process. In your terminal emulator, you'll have one tab running vim on your files, and in the other tab, you'll have sbt. In the tab with sbt, you can run ~compile, which recompiles your code every time you save a file in Vim. This replicates how IDEs show compiler errors/warnings as you type.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install TweetSets
Standalone installation
Create data directories on a volume with adequate storage:
mkdir -p /tweetset_data/redis
mkdir -p /tweetset_data/datasets
mkdir -p /tweetset_data/elasticsearch/esdata1
mkdir -p /tweetset_data/elasticsearch/esdata2
chown -R 1000:1000 /tweetset_data/elasticsearch
Create an esdata<number> directory for each ElasticSearch container.
On OS X, the redis and esdata<number> directories must be ugo+rwx.
Create a directory, to be named as you choose, where tweet data files will be stored for loading. mkdir /dataset_loading
Clone or download this repository: git clone https://github.com/gwu-libraries/TweetSets.git
Change to the docker directory: cd docker
Copy the example docker files:
cp example.docker-compose.yml docker-compose.yml
cp example.env .env
Edit .env. This file is annotated to help you select appropriate values.
Create dataset_list_msg.txt in the docker directory. The contents of this file will be displayed on the dataset list page. It can be used to list other datasets that are available, but not yet loaded. If you want to leave the file empty, simply create it: touch dataset_list_msg.txt
Bring up the containers: docker-compose up -d
For HTTPS support, uncomment and configure the nginx-proxy container in docker-compose.yml.
Cluster installation: primary node
Clusters must have at least a primary node and two additional nodes.
Create data directories on a volume with adequate storage:
mkdir -p /tweetset_data/redis
mkdir -p /tweetset_data/datasets
mkdir -p /tweetset_data/full_datasets
mkdir -p /tweetset_data/elasticsearch
chown -R 1000:1000 /tweetset_data/elasticsearch
Create a directory, to be named as you choose, where tweet data files will be stored for loading. mkdir /dataset_loading
Clone or download this repository: git clone https://github.com/gwu-libraries/TweetSets.git
Change to the docker directory: cd docker
Copy the example docker files:
cp example.cluster-primary.docker-compose.yml docker-compose.yml
cp example.env .env
Edit .env. This file is annotated to help you select appropriate values.
Create dataset_list_msg.txt in the docker directory. The contents of this file will be displayed on the dataset list page. It can be used to list other datasets that are available, but not yet loaded. If you want to leave the file empty, simply create it: touch dataset_list_msg.txt
Bring up the containers: docker-compose up -d
Cluster installation: additional node
Create data directories on a volume with adequate storage:
mkdir -p /tweetset_data/elasticsearch
chown -R 1000:1000 /tweetset_data/elasticsearch
Clone or download this repository: git clone https://github.com/gwu-libraries/TweetSets.git
Change to the docker directory: cd docker
Copy the example docker files:
cp example.cluster-node.docker-compose.yml docker-compose.yml
cp example.cluster-node.env .env
Edit .env. This file is annotated to help you select appropriate values. Note that 2 cluster nodes must have MASTER set to true.
Bring up the containers: docker-compose up -d
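After bringing up the containers in any of the configurations above, it can be useful to confirm that the Elasticsearch nodes formed a healthy cluster. A minimal sketch with the official Python client, assuming the Elasticsearch HTTP port (9200) is published to the host (whether it is depends on your docker-compose.yml and .env settings):

    from elasticsearch import Elasticsearch

    # Connect to the cluster; adjust the host/port to match your deployment.
    es = Elasticsearch("http://localhost:9200")

    # A healthy cluster reports status "green" (or "yellow" on a single node).
    print(es.cluster.health())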