dvc.org | 📖 DVC website and documentation | Frontend Framework library
kandi X-RAY | dvc.org Summary
DVC project website's source code. Documentation and blog content. Contributions are welcome!
Community Discussions
Trending Discussions on dvc.org
QUESTION
Has anybody installed DVC on MinIO storage?
I have read the docs, but not everything is clear to me.
Which command should I use to set up MinIO storage with these parameters:
storage url: https://minio.mysite.com/minio/bucket-name/
login: my_login
password: my_password
...ANSWER
Answered 2021-May-21 at 12:14
Install
I usually use it as a Python package; in this case you need to install:
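The answer's own commands are not reproduced on this page. As a rough sketch only, assuming MinIO is addressed as an S3-compatible remote and DVC's S3 support (the dvc[s3] extra) is installed, the parameters from the question could be turned into `dvc remote` commands like this. The helper name `minio_remote_commands`, the remote name `minio`, and the way the bucket and endpoint are split out of the storage URL are all assumptions, not something the thread confirms.

```python
from urllib.parse import urlsplit


def minio_remote_commands(storage_url, login, password, remote_name="minio"):
    """Build `dvc remote` commands for an S3-compatible MinIO bucket.

    Hypothetical helper: it only assembles the command lines, it does not run them.
    """
    parts = urlsplit(storage_url)
    # Assumption: the last path segment is the bucket name and
    # scheme + host is the S3 API endpoint.
    bucket = parts.path.rstrip("/").split("/")[-1]
    endpoint = f"{parts.scheme}://{parts.netloc}"
    return [
        ["dvc", "remote", "add", "-d", remote_name, f"s3://{bucket}"],
        ["dvc", "remote", "modify", remote_name, "endpointurl", endpoint],
        # --local writes credentials to .dvc/config.local, which is not tracked by Git.
        ["dvc", "remote", "modify", "--local", remote_name, "access_key_id", login],
        ["dvc", "remote", "modify", "--local", remote_name, "secret_access_key", password],
    ]


commands = minio_remote_commands(
    "https://minio.mysite.com/minio/bucket-name/", "my_login", "my_password"
)
for command in commands:
    print(" ".join(command))
```

If the MinIO API is served from a different host or port than the browser URL, the `endpointurl` value would need to be adjusted by hand.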
QUESTION
I have been storing my large files in CLOBs within Oracle, but I am thinking of storing my large files in a shared drive, then having a column in Oracle contain pointers to the files. This would use DVC.
When I do this,
(a) are the paths in Oracle paths that point to the files in my shared drive, as in, the actual files themselves?
(b) or do the paths in Oracle point somehow to the DVC metafile?
Any insight would help me out!
Thanks :) Justin
EDIT to provide more clarity:
I checked here (https://dvc.org/doc/api-reference/open), and it helped, but I'm not fully there yet ...
I want to pull a file from a remote dvc repository using python (which I have connected to the Oracle database). So, if we can make that work, I think I will be good. But, I am confused. If I specify 'remote' below, then how do I name the file (e.g., 'activity.log') when the remote files are all encoded?
...ANSWER
Answered 2021-Apr-03 at 19:28
I'm not 100% sure that I understand the question (it would be great to expand it a bit on the actual use case you are trying to solve with this database), but I can share a few thoughts.
When we talk about DVC, I think you need to specify a few things to identify the file/directory:
- Git commit + path (actual path like data/data/xml). Commit (or to be precise any Git revision) is needed to identify the version of the data file.
- Or path in the DVC storage (/mnt/shared/storage/00/198493ef2343ao ...) + actual name of this file. This way you would be saving info that .dvc files have.
I would say that the second way is not recommended, since to some extent it's an implementation detail: how DVC stores files internally. The public interface to DVC-organized data storage is its repository URL + commit + file name.
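To make the "repository URL + commit + file name" interface concrete, here is a minimal sketch built around `dvc.api.open`, the function linked in the question. The file is addressed by its path as tracked in the Git repository, not by its hashed name in the remote. The repository URL and revision below are placeholders, and the helper names `dvc_open_args` and `read_tracked_text` are hypothetical.

```python
def dvc_open_args(path, repo, rev=None, remote=None):
    """Collect keyword arguments for dvc.api.open, dropping the unset ones."""
    args = {"path": path, "repo": repo, "rev": rev, "remote": remote}
    return {key: value for key, value in args.items() if value is not None}


def read_tracked_text(path, repo, rev=None, remote=None):
    """Read a DVC-tracked text file addressed by repo URL, revision and path."""
    import dvc.api  # imported lazily; requires the dvc package to be installed

    with dvc.api.open(**dvc_open_args(path, repo, rev, remote)) as handle:
        return handle.read()


# Placeholder repository URL and revision, for illustration only.
args = dvc_open_args("activity.log", "https://example.com/org/repo.git", rev="v1.0")
print(args)
```

Passing `remote` only selects which configured DVC remote to read from; the file name stays the human-readable path recorded in the repository.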
Edit (example):
QUESTION
I'm trying to setup DVC with Google Drive storage as shown here. So far, I've been unsuccessful in pushing data to the remote. I tried both with and without the Google App setup.
After running dvc push -v, the following exception is shown:
ANSWER
Answered 2021-Mar-05 at 06:56
Can you try installing google-api-python-client==1.12.8 and test it that way?
Edit:
It appears that this was a bug in versions 2.0.0-2.0.1 of google-api-python-client that was resolved in 2.0.2, so google-api-python-client>=2.0.2 should also work.
QUESTION
I have a couple of projects that are using and updating the same data sources. I recently learned about dvc's data registries, which sound like a great way of versioning data across these different projects (e.g. scrapers, computational pipelines).
I have put all of the relevant data into data-registry and then I imported the relevant files into the scraper project with:
ANSWER
Answered 2021-Mar-02 at 15:29
When you import (or add) something into your project, a .dvc file is created that lists that something (in this case the raw/ dir) as an "output".
DVC doesn't allow overlapping outputs among .dvc files or dvc.yaml stages, meaning that your "menu_items" stage shouldn't write to raw/ since it's already under the control of raw.dvc.
Can you make a separate directory for the pipeline outputs? E.g. use processed/menu_items/restaurant.jsonl
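The overlap rule can be illustrated with a small path check. This is only a sketch of the idea, not DVC's own implementation, and the path raw/menu_items/restaurant.jsonl is a hypothetical example of a stage output placed inside raw/.

```python
from pathlib import PurePosixPath


def outputs_overlap(first, second):
    """Return True if two output paths are equal or one contains the other."""
    a, b = PurePosixPath(first), PurePosixPath(second)
    return a == b or a in b.parents or b in a.parents


# raw/ is already an output of raw.dvc, so a stage writing inside it overlaps.
inside_raw = outputs_overlap("raw", "raw/menu_items/restaurant.jsonl")
# A separate directory for pipeline outputs does not overlap.
separate_dir = outputs_overlap("raw", "processed/menu_items/restaurant.jsonl")
print(inside_raw, separate_dir)
```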
QUESTION
Some packages, such as DVC, allow you to install extra dependencies to use additional features. To install a single extra dependency, whether on the command line or in a requirements.txt, you simply use brackets:
ANSWER
Answered 2021-Feb-11 at 09:44
Of course, I found the solution right after posting, you just have to remove the space after the comma:
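The original snippet is not shown on this page. As an illustration of the "no space after the comma" rule, the sketch below assembles a requirement string with several extras. The helper name is hypothetical, and s3 and ssh are used only as example DVC extras.

```python
def requirement_with_extras(package, extras):
    """Join extras with bare commas so the result is a single shell argument."""
    cleaned = [extra.strip() for extra in extras]
    return f"{package}[{','.join(cleaned)}]"


# The stray space in " ssh" is stripped before joining.
spec = requirement_with_extras("dvc", ["s3", " ssh"])
print(spec)
```

The resulting string can be passed to pip as one argument or written as one line of a requirements.txt.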
QUESTION
I use DVC to track my media files. I use macOS and I want ".DS_Store" files to be ignored by DVC. According to the DVC documentation I can achieve this with .dvcignore. I created a .dvcignore file with a ".DS_Store" rule. However, every time ".DS_Store" is created, dvc status still says that the content has changed.
Here is the little test to reproduce my issue:
...ANSWER
Answered 2019-Jun-05 at 18:09
The current implementation of .dvcignore is very limited. Read more on it here.
Please, mention that you are interested in this feature here - https://github.com/iterative/dvc/issues/1876. That would help our team to prioritize issues properly.
The possible workaround for now would be to use one of these approaches - How to stop creating .DS_Store on Mac?
QUESTION
I dvc add-ed a file I did not mean to add. I have not yet committed.
How do I undo this operation? In Git, you would do git rm --cached.
To be clear: I want to make DVC forget about the file, and I want the file to remain untouched in my working tree. This is the opposite of what dvc remove does.
One issue on the DVC issue tracker suggests that dvc unprotect is the right command. But reading the manual page suggests otherwise.
Is this possible with DVC?
...ANSWER
Answered 2019-Sep-17 at 13:12
As per mroutis on the DVC Discord server:
- dvc unprotect the file; this won't be necessary if you don't use symlink or hardlink caching, but it can't hurt.
- Remove the .dvc file
- If you need to delete the cache entry itself, run dvc gc, or look up the MD5 in data.dvc and manually remove it from .dvc/cache.
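For the manual route in the last step, the sketch below shows where a cache entry would sit for a given MD5. It assumes the cache layout of DVC releases from around the time of this answer, where the first two hex characters of the hash form a subdirectory (the same shape as the .../00/198493... path quoted in an earlier thread). Newer releases may lay out the cache differently, so check before deleting anything. The helper name and the sample hash are illustrative.

```python
from pathlib import PurePosixPath


def cache_entry_path(md5, cache_dir=".dvc/cache"):
    """Map an MD5 from a .dvc file to its assumed location in the cache."""
    # Assumed layout: <cache_dir>/<first two hex chars>/<remaining chars>
    return PurePosixPath(cache_dir) / md5[:2] / md5[2:]


# Sample hash for illustration (the MD5 of empty input).
entry = cache_entry_path("d41d8cd98f00b204e9800998ecf8427e")
print(entry)
```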
Edit -- there is now an issue on their Github page to add this to the manual: https://github.com/iterative/dvc.org/issues/625
QUESTION
According to this tutorial, when I update a file I should first remove it from under DVC control (i.e. execute dvc unprotect .dvc or dvc remove .dvc) and then add it again via dvc add. However, it's not clear whether I should apply the same workflow to directories.
I have the directory under DVC control with the following structure:
...ANSWER
Answered 2019-May-24 at 05:04
Only when a file is updated, i.e. you edit 1.jpg with your editor, AND only if the hardlink or symlink cache type is enabled.
Please, check this link:
updating tracked files has to be carried out with caution to avoid data corruption when the DVC config option cache.type is set to hardlink or/and symlink
I would strongly recommend reading this document: Performance Optimization for Large Files. It explains the benefits of using hardlinks/symlinks.
QUESTION
I am following the tutorial about Data Version Control using mingw32 on Windows 7.
I am getting a very strange error when I try to use run:
...ANSWER
Answered 2018-Oct-27 at 08:34
I'm one of the dvc developers. A similar error has affected dvc running on cygwin. We've released a fix for it in 0.20.0. Please upgrade.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported