datapusher | standalone web service that pushes data files | REST library

 by ckan | Python | Version: 0.0.17 | License: AGPL-3.0

kandi X-RAY | datapusher Summary

datapusher is a Python library typically used in Web Services, REST, and Docker applications. datapusher has no bugs and no reported vulnerabilities, a build file is available, it has a Strong Copyleft license, and it has low support. You can install it with 'pip install datapusher' or download it from GitHub or PyPI.

DataPusher is a standalone web service that automatically downloads tabular data files (such as CSV or Excel) from a CKAN site's resources when they are added to the site, parses them to pull out the actual data, and then uses the DataStore API to push that data into the site's DataStore. This makes the data from the resource files available via CKAN's DataStore API. In particular, many of CKAN's data preview and visualization plugins only work (or work much better) with files whose contents are in the DataStore.
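
Once DataPusher has pushed a file, its rows become queryable through CKAN's DataStore API. The snippet below is a minimal sketch of checking that from Python; the CKAN URL and resource id are placeholders, and the requests package is assumed to be installed.

              # Sketch: verify that a resource's data has reached the DataStore.
              # CKAN_URL and RESOURCE_ID are placeholders for your own site and resource.
              import requests

              CKAN_URL = "https://demo.ckan.org"
              RESOURCE_ID = "00000000-0000-0000-0000-000000000000"

              resp = requests.get(
                  f"{CKAN_URL}/api/3/action/datastore_search",
                  params={"resource_id": RESOURCE_ID, "limit": 5},
                  timeout=30,
              )
              resp.raise_for_status()
              result = resp.json()["result"]
              print(result["fields"])   # column definitions created when the data was pushed
              print(result["records"])  # the first few rows now held in the DataStore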

            kandi-support Support

              datapusher has a low-activity ecosystem.
              It has 54 star(s) with 145 fork(s). There are 21 watchers for this library.
              It had no major release in the last 12 months.
              There are 80 open issues and 61 have been closed. On average, issues are closed in 251 days. There are 22 open pull requests and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of datapusher is 0.0.17.

            kandi-Quality Quality

              datapusher has 0 bugs and 23 code smells.

            kandi-Security Security

              datapusher has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              datapusher code analysis shows 0 unresolved vulnerabilities.
              There are 68 security hotspots that need review.

            kandi-License License

              datapusher is licensed under the AGPL-3.0 License. This license is Strong Copyleft.
              Strong Copyleft licenses enforce sharing, and you can use them when creating open source projects.

            kandi-Reuse Reuse

              datapusher releases are available to install and integrate.
              A deployable package is available on PyPI.
              A build file is available, so you can build the component from source.
              Installation instructions, examples and code snippets are available.
              datapusher saves you 482 person hours of effort in developing the same functionality from scratch.
              It has 1135 lines of code, 66 functions and 13 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

             kandi has reviewed datapusher and discovered the functions listed below as its top functions. This is intended to give you an instant insight into the functionality datapusher implements, and to help you decide whether it suits your requirements; a simplified sketch of this flow follows the list.
            • Push data to a datastore
            • Check HTTP response status code
            • Check if a datastore resource exists
             • Split items into chunks
            • Send data to Datastore
            • Update a resource
            • Delete a resource
             • Get a CKAN resource
            • Validate input
            • Get URL for ckan action
            • Start the web server
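
             Several of these functions describe one overall flow: rows parsed from a file are split into batches, and each batch is posted to CKAN's datastore_create action with the HTTP status of every response checked. The sketch below is illustrative only, not datapusher's actual code; the CKAN URL, API token, batch size and helper names are assumptions.

              # Illustrative sketch only - not datapusher's actual implementation.
              # Rows are batched and each batch is sent to CKAN's datastore_create action.
              import itertools
              import requests

              CKAN_URL = "https://demo.ckan.org"    # placeholder CKAN site
              API_TOKEN = "REPLACE_WITH_API_TOKEN"  # placeholder API token
              CHUNK_SIZE = 250                      # assumed batch size

              def chunks(iterable, size):
                  """Yield successive lists of at most `size` items."""
                  it = iter(iterable)
                  while True:
                      batch = list(itertools.islice(it, size))
                      if not batch:
                          return
                      yield batch

              def push_records(resource_id, fields, records):
                  """Send records to the DataStore in batches via datastore_create."""
                  for batch in chunks(records, CHUNK_SIZE):
                      resp = requests.post(
                          f"{CKAN_URL}/api/3/action/datastore_create",
                          json={
                              "resource_id": resource_id,
                              "fields": fields,   # e.g. [{"id": "name", "type": "text"}]
                              "records": batch,
                              "force": True,
                          },
                          headers={"Authorization": API_TOKEN},
                          timeout=30,
                      )
                      resp.raise_for_status()  # check the HTTP response status code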

            datapusher Key Features

            No Key Features are available at this moment for datapusher.

            datapusher Examples and Code Snippets

             DataPusher, Production deployment
             Python | Lines of Code: 26 | License: Strong Copyleft (AGPL-3.0)
             # Install requirements for the DataPusher
             sudo apt install python3-venv python3-dev build-essential
             sudo apt-get install python-dev python-virtualenv build-essential libxslt1-dev libxml2-dev git libffi-dev
            
             # Create a virtualenv for datapusher
             s  
             DataPusher, Development installation
             Python | Lines of Code: 9 | License: Strong Copyleft (AGPL-3.0)
            sudo apt-get install python-dev python-virtualenv build-essential libxslt1-dev libxml2-dev zlib1g-dev git libffi-dev
            
            git clone https://github.com/ckan/datapusher
            cd datapusher
            
            pip install -r requirements.txt
            pip install -r requirements-dev.txt
            pip   
             DataPusher, Usage, Command line
             Python | Lines of Code: 4 | License: Strong Copyleft (AGPL-3.0)
            ckan -c /etc/ckan/default/ckan.ini datapusher resubmit
            
            paster --plugin=ckan datapusher resubmit -c /etc/ckan/default/ckan.ini
            
            ckan -c /etc/ckan/default/ckan.ini datapusher submit {dataset_id}
            
            paster --plugin=ckan datapusher submit  -c /etc/ckan/de  
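
             Besides the command line, a push can also be triggered through CKAN's action API: the datapusher extension exposes a datapusher_submit action that queues a resource for pushing. A minimal sketch, assuming the site has the datapusher plugin enabled; the CKAN URL, API token and resource id below are placeholders.

             # Sketch: trigger DataPusher for one resource via CKAN's action API.
             # CKAN_URL, API_TOKEN and RESOURCE_ID are placeholders.
             import requests

             CKAN_URL = "https://demo.ckan.org"
             API_TOKEN = "REPLACE_WITH_API_TOKEN"
             RESOURCE_ID = "00000000-0000-0000-0000-000000000000"

             resp = requests.post(
                 f"{CKAN_URL}/api/3/action/datapusher_submit",
                 json={"resource_id": RESOURCE_ID},
                 headers={"Authorization": API_TOKEN},
                 timeout=30,
             )
             resp.raise_for_status()
             print(resp.json()["success"])  # True if the job was accepted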

            Community Discussions

            QUESTION

            CKAN: how do I update/create the data dictionary of a resource using the api?
            Asked 2021-Mar-22 at 14:19

             My company is using a CKAN instance configured with DataStore and DataPusher. When a CSV file is uploaded to CKAN, DataPusher sends it to the DataStore and creates a default Data Dictionary for the resource. The Data Dictionary is a very nice feature to display the description of the data fields to users.

            I can update the Data Dictionary using the UI, or it can be sent as part of the Fields passed to datastore_create().

             My problem is that I don't control the call to datastore_create(), because this method is automatically called by the DataPusher service.

             I want to programmatically set the values of the Data Dictionary, but I can't find an API call that allows me to do it, i.e. an API call that updates the Fields metadata. Can I do it using the API? Or maybe it is possible to create it when I create the data resource. I'd like a code example.

            ...

            ANSWER

            Answered 2021-Mar-22 at 14:19

            You can use the API call datastore_create on top of an existing table. This will not impact the data in the table.

             You should use datastore_search to check how the dictionary is saved in one of your resources (result -> fields -> info). Use that as your base, make the desired changes, and use it in the body of the datastore_create call.

            Unfortunately, the API call datastore_info does not give you back that information.

             The majority of the CKAN UI functionality is available through the API as well. In this case, the UI's controller itself makes use of "datastore_create".
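
             The snippet below is a minimal sketch of that round trip: read the current field definitions with datastore_search, attach or edit their "info" entries, and re-submit them with datastore_create on top of the existing table. The CKAN URL, API token, resource id and the example label/notes values are placeholders.

             # Sketch: update a resource's Data Dictionary by re-posting its fields
             # (with an "info" block) to datastore_create. All identifiers are placeholders.
             import requests

             CKAN_URL = "https://demo.ckan.org"
             API_TOKEN = "REPLACE_WITH_API_TOKEN"
             RESOURCE_ID = "00000000-0000-0000-0000-000000000000"

             # 1. Read the current field definitions (result -> fields -> info).
             search = requests.get(
                 f"{CKAN_URL}/api/3/action/datastore_search",
                 params={"resource_id": RESOURCE_ID, "limit": 0},
                 timeout=30,
             ).json()["result"]

             # The automatic "_id" field cannot be redefined, so drop it.
             fields = [f for f in search["fields"] if f["id"] != "_id"]

             # 2. Attach or edit the Data Dictionary entries (illustrative values).
             for field in fields:
                 field.setdefault("info", {})
                 field["info"]["label"] = field["id"].replace("_", " ").title()
                 field["info"]["notes"] = "Description set programmatically."

             # 3. Re-create the field definitions on top of the existing table.
             resp = requests.post(
                 f"{CKAN_URL}/api/3/action/datastore_create",
                 json={"resource_id": RESOURCE_ID, "fields": fields, "force": True},
                 headers={"Authorization": API_TOKEN},
                 timeout=30,
             )
             resp.raise_for_status()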

            Source https://stackoverflow.com/questions/66640439

            QUESTION

            How do I run a CKAN docker image?
            Asked 2020-Oct-15 at 17:28

            I have been trying for several days now to run CKAN as a docker image. The official CKAN documentation explains in detail how to create your own docker image via "docker-compose". The basic workflow is:

            1. Clone the CKAN source files from GitHub
            2. Make some changes in the "docker-compose.yml" file (custom passwords, extensions, etc)
            3. Run "docker-compose"

            This gives you a running CKAN docker container together with all the necessary databases and search engines.

             My ultimate goal, however, is to push this CKAN docker image to Docker Hub and to run it on other machines via "docker run". I want to use this approach because I want to extensively modify the original CKAN installation, and also add custom datasets, groups and organizations to the running catalog before pushing it to Docker Hub. "docker run" seems to be much easier and more convenient compared to using "docker-compose".

            The problem is: Whenever I try to run the CKAN container using the commands below I end up with the following error:

            ...

            ANSWER

            Answered 2020-Oct-15 at 17:28

             Run it with docker-compose. If you want to run it with plain "docker run", you need to split the Docker files apart and pass the arguments from the .env file yourself; as the error says, the SQLAlchemy setting is not being sent.

             I would recommend executing it with docker-compose. Then, once the image has been built by docker-compose, you can just run "docker run" and pass the missing arguments such as the SQLAlchemy setting.

            Source https://stackoverflow.com/questions/64005263

             Community Discussions and Code Snippets contain sources from the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install datapusher

             Install the required packages (see the installation snippets above).
             The default DataPusher configuration uses SQLite as the backend for the jobs database and a single uWSGI thread. To increase performance and concurrency you can configure DataPusher in the following ways:
             Use Postgres as the database backend, which allows concurrent writes (and provides a more reliable backend anyway). To use Postgres, create a user and a database and update the SQLALCHEMY_DATABASE_URI setting accordingly:

              # This assumes DataPusher is already installed
              sudo apt-get install postgresql libpq-dev
              sudo -u postgres createuser -S -D -R -P datapusher_jobs
              sudo -u postgres createdb -O datapusher_jobs datapusher_jobs -E utf-8

              # Run this in the virtualenv where DataPusher is installed
              pip install psycopg2

              # Edit SQLALCHEMY_DATABASE_URI in datapusher_settings.py accordingly
              # eg SQLALCHEMY_DATABASE_URI=postgresql://datapusher_jobs:YOURPASSWORD@localhost/datapusher_jobs

             Start more uWSGI threads. In the deployment/datapusher-uwsgi.ini file, set workers and threads to values that suit your needs, and add the lazy-apps = true setting to avoid concurrency issues with SQLAlchemy, e.g.:

              # ... rest of datapusher-uwsgi.ini
              workers = 3
              threads = 3
              lazy-apps = true

            Support

             For any new features, suggestions and bugs, create an issue on GitHub. If you have any questions, check and ask them on the Stack Overflow community page.
            Install
          • PyPI

            pip install datapusher

          • Clone (HTTPS)

            https://github.com/ckan/datapusher.git

          • CLI

            gh repo clone ckan/datapusher

          • Clone (SSH)

            git@github.com:ckan/datapusher.git
