syncfs | offers something between mounting a cloud storage system | Cloud Storage library
kandi X-RAY | syncfs Summary
SyncFS is a Filesystem in Userspace (FUSE) that offers something between mounting a cloud storage system using FUSE while keeping all changes remotely, and syncing a cloud drive locally. It was originally designed to provide a cloud storage backend to ZBackup. It works by mirroring the files on the cloud (the remote) to the local filesystem, but only keeping the files local that are currently in use. If a file is opened that is not stored locally, it will be downloaded to the local filesystem before the open system call returns. From that point on, all filesystem operations are done locally. When the file is closed, any changes will be uploaded to the cloud and the file will eventually be removed locally when it is no longer needed. The policy of when to keep a file locally and how long to wait until a file is uploaded can be configured on a per-file or per-directory basis.
Community Discussions
Trending Discussions on syncfs
QUESTION
I'm writing a WebExtension that uses C++ code compiled with emscripten. The WebExtension downloads files which I want to process inside the C++ code. I'm aware of the File System API and I think I read most of it, but I don't get it to work - making a downloaded file accessible in emscripten.
This is the relevant JavaScript part of my WebExtension:
...ANSWER
Answered 2020-Sep-19 at 07:47
It turned out that I was mixing some things up while trying different methods, and I was also misinterpreting my debugging tools. So the easiest way to accomplish this task (without using IDBFS) is:
JS:
QUESTION
I'm trying out Emscripten's IndexedDB support, but can't get it running. The file can't be loaded: "cannot open file". With EMSCRIPTEN_FETCH_LOAD_TO_MEMORY everything works fine.
- Download file via emscripten's emscripten_fetch_t
- Save file directly in IndexedDB via EMSCRIPTEN_FETCH_PERSIST_FILE
- Load it later into memory
ANSWER
Answered 2020-Jun-10 at 12:47
EMSCRIPTEN_FETCH_PERSIST_FILE actually does two things:
- Check whether the file is already stored in IndexedDB. If yes, retrieve it from there and do not bother the server at all.
- If the file is not in the local IndexedDB, download it from the server and cache it there.
In particular, there is very little cache control: you can use EM_IDB_DELETE to delete the cached version, but that's it.
So, it looks like you're not supposed to use IDBFS to access fetched files. Use the Fetch library instead and it will use the cached version without any network roundtrips.
It may also be useful to add -s FETCH_DEBUG to the compilation line.
Prerequisite for understanding: read through IndexedDB Basic Concepts to understand "database" and "object storage".
Looking into Emscripten's source code as well as the "Storage" tab in my browser's developer console:
- IDBFS mounted at /data-foo-bar will provide access to a database called /data-foo-bar. Specifically, its FILE_DATA object storage (see here). Naming here is arbitrary and hard-coded. Files stored in the /data-foo-bar folder are stored in keys like /data-foo-bar/my-file.txt inside that object storage.
- In particular, you cannot access arbitrary IndexedDB databases or object storages.
- On the other hand, the Fetch library stores files to the emscripten_filesystem database, inside the FILES object storage. Again, names look arbitrary and are hard-coded.
So, you don't get access to Fetch's cache through IDBFS simply because they access different storage objects in different databases with different naming conventions.
As an example, here is what FS.writeFile('/data/my-file.txt', 'hello') results in in my Firefox:
And here is where the Fetch's cache lives:
Unfortunately, I don't know why the content of http://localhost:8000/test.txt is shown as an empty object.
QUESTION
What is the difference between fsync and syncfs ?
...ANSWER
Answered 2018-Jan-09 at 16:36
First, fsync() (and sync()) are POSIX-standard functions while syncfs() is Linux-only.
So availability is one big difference.
From the POSIX standard for fsync():
The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected.
Note that it's just a request.
From the POSIX standard for sync():
The sync() function shall cause all information in memory that updates file systems to be scheduled for writing out to all file systems.
The writing, although scheduled, is not necessarily complete upon return from sync().
Again, that's not something guaranteed to happen.
The Linux man page for syncfs() (and sync()) states:
sync() causes all pending modifications to filesystem metadata and cached file data to be written to the underlying filesystems.
syncfs() is like sync(), but synchronizes just the filesystem containing file referred to by the open file descriptor fd.
Note that when the function returns is unspecified.
The Linux man page for fsync() states:
fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) so that all changed information can be retrieved even if the system crashes or is rebooted. This includes writing through or flushing a disk cache if present. The call blocks until the device reports that the transfer has completed.
As well as flushing the file data, fsync() also flushes the metadata information associated with the file (see inode(7)).
Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.
Note that the guarantees Linux provides for fsync() are much stronger than those provided for sync() or syncfs(), and by POSIX for both fsync() and sync().
In summary:
- POSIX fsync(): "please write data for this file to disk"
- POSIX sync(): "write all data to disk when you get around to it"
- Linux sync(): "write all data to disk (when you get around to it?)"
- Linux syncfs(): "write all data for the filesystem associated with this file to disk (when you get around to it?)"
- Linux fsync(): "write all data and metadata for this file to disk, and don't return until you do"
Note that the Linux man page doesn't specify when sync() and syncfs() return.
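The distinction drawn above can be sketched in C. This is a minimal illustration, not code from the answer: write_durably and flush_filesystem_of are hypothetical helper names, the first following the quoted fsync() man page text (flush the file, then the containing directory) and the second wrapping the Linux-only syncfs(), which glibc only declares when _GNU_SOURCE is defined.

```c
#define _GNU_SOURCE            /* needed for the Linux-only syncfs() declaration */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write buf to dir/name, then fsync() the file and the
 * directory that contains it. Returns 0 on success, -1 on any failure. */
int write_durably(const char *dir, const char *name, const void *buf, size_t len)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {                      /* write() may be partial */
        ssize_t w = write(fd, p, left);
        if (w < 0) { close(fd); return -1; }
        p += w;
        left -= (size_t)w;
    }

    if (fsync(fd) != 0) { close(fd); return -1; }   /* file data and metadata */
    if (close(fd) != 0)
        return -1;

    int dfd = open(dir, O_RDONLY | O_DIRECTORY);    /* flush the directory entry */
    if (dfd < 0)
        return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc;
}

/* Hypothetical helper: ask the kernel to flush the one filesystem that
 * contains path, rather than every mounted filesystem as sync() would. */
int flush_filesystem_of(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = syncfs(fd);
    close(fd);
    return rc;
}
```

A caller that only cares about one file would use write_durably("/some/dir", "file.txt", data, size); flush_filesystem_of() is the heavier tool and affects every file on that filesystem.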
QUESTION
Considering two functions that flush buffers:
fflush()
sync()
How can I know when a call to either one is needed?
I know adding a '\n' to printf() will flush the output buffer, but if the string contains no such character, when can I skip this call, and when not (multi-threaded systems?)?
Same goes for sync. I have a function that saves files into the file system (the saving is done via a series of system calls), and it seems that without a call to sync the files are not saved in a specific case.
Unfortunately I don't have all of the details on that case currently [What I do know is that the files are saved and a power off occurs right after that (don't know exactly how soon) and the files are not there after reboot]. In all the tests I ran, the files were saved correctly.
So, how can I figure out when the system will flush the file data/metadata buffers and when it will not, so that I need to call sync() explicitly?
Quoting man (which does not specify when an explicit call is needed):
...ANSWER
Answered 2018-Oct-02 at 05:32
sync()
You seldom need a call to sync(). When you do, you have something that it is crucial must be recorded on disk ASAP. However, sync() will return having scheduled the writing of buffers in kernel memory, not after they've been written, so you won't know that they've actually been written, and so it isn't wholly reliable. If you need more control over the writing for your file, look at the O_SYNC, O_DSYNC, O_RSYNC flags for open(). You will probably have to use fcntl() and fileno() to set these flags if you use file streams rather than file descriptors.
Two caveats:
- sync() won't write buffers from your process (or any other process) to the kernel buffer pool; it is wholly unrelated to fflush().
- sync() affects all data written by all processes on the system; you can become unpopular if your application uses it very often; it subverts the good work the kernel does caching data.
fflush()
The fflush() function ensures that data has been written to the kernel buffer pools from your application's buffer (either for a single file, or for all output files if you use fflush(0) or fflush(NULL)). It doesn't directly affect other processes. Again, you use this when you need to be confident that pending output has been sent to the kernel for onwards transmission. One place where you might use it is before an input operation where you want the prompt to appear, even if it hasn't got a newline at the end. Otherwise, you don't often use it, but you can use it whenever you want to be sure data has been sent to the kernel for writing. If you're debugging and your program crashes, a sprinkling of fflush() statements can ensure that pending output is written before the crash. This can help reveal more accurately where the problem is (but so can using a debugger).
Note that setting unbuffered output (setbuf(stdout, NULL) or setvbuf(stdout, NULL, _IONBF, 0)) means all output occurs 'immediately'. This is not necessarily good for performance. You use it sometimes, but only fairly rarely.
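The two layers described in this answer can be put side by side in a short C sketch. It is an illustration under assumptions, not the answerer's code: save_text is a hypothetical helper, and it uses a per-file fsync() on the stream's descriptor instead of the system-wide sync() that the question asks about, since the goal is to make one file durable.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write text to path through a stdio stream and push it
 * through both buffering layers. Returns 0 on success, -1 on failure. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF) { fclose(fp); return -1; }

    /* Layer 1: the application's stdio buffer to the kernel buffer pool. */
    if (fflush(fp) != 0) { fclose(fp); return -1; }

    /* Layer 2: the kernel buffers for this one file to the storage device. */
    if (fsync(fileno(fp)) != 0) { fclose(fp); return -1; }

    return fclose(fp) == 0 ? 0 : -1;
}
```

Skipping the fflush() step would leave the text in the process's own buffer, where neither fsync() nor sync() can see it; skipping fsync() would leave it in kernel memory, which is the window in which a power-off can lose the file.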
QUESTION
I have a file with some data, which is also memory-mapped. So that I have both file descriptor and the pointer to the mapped pages. Mostly the data is only read from the mapping, but eventually it's also modified.
The modification consists of modifying some data within the file (sort of headers update), plus appending some new data (i.e. writing post the current end of the file).
This data structure is accessed from different threads, and to prevent collisions I synchronize access to it (mutex and friends).
During the modification I use both the file mapping and the file descriptor. Headers are updated implicitly by modifying the mapped memory, whereas the new data is written to the file by the appropriate API (WriteFile on Windows, write on POSIX). It is worth noting that the new data and the headers belong to different pages.
Since the modification changes the file size, the memory mapping is re-initialized after every such modification. That is, it's unmapped, and then mapped again (with the new size).
I realize that writes to the mapped memory are "asynchronous" wrt the file system, and order is not guaranteed, but I thought there was no problem because I explicitly close the file mapping, which should (IMHO) act as a sort of flushing point.
Now this works without problem on windows, but on linux (android to be exact) eventually the mapped data turns-out to be inconsistent temporarily (i.e. data is ok when retrying). Seems like it doesn't reflect the newly-appended data.
Do I have to call some synchronization API to ensure the data is flushed properly? If so, which one should I use: sync, msync, syncfs or something different?
Thanks in advance.
EDIT:
This is a pseudo-code that illustrates the scenario I'm dealing with. (The real code is more complex of course)
...ANSWER
Answered 2018-Aug-21 at 10:23
Use addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, fd, offset) to map the file.
If the size of the file changes, use newaddr = mremap(addr, len, newlen, MREMAP_MAYMOVE) to update the mapping to reflect it. To extend the file, use ftruncate(fd, newlen) before remapping the file.
You can use mprotect(addr, len, protflags) to change the protection (read/write) on any pages in the mapping (both must be aligned on a page boundary). You can also tell the kernel about your future accesses via madvise(), if the mapping is too large to fit in memory at once, but the kernel seems pretty darned good at managing readahead etc. even without those.
When you make changes to the mapping, use msync(partaddr, partlen, MS_SYNC | MS_INVALIDATE) or msync(partaddr, partlen, MS_ASYNC | MS_INVALIDATE) to ensure the changes in the partlen chars from partaddr forward are visible to other mappings and file readers. If you use MS_SYNC, the call returns only when the update is complete. The MS_ASYNC call tells the kernel to do the update, but won't wait until it is done. If there are no other memory maps of the file, the MS_INVALIDATE does nothing; but if there are, that tells the kernel to ensure the changes are reflected in those too.
In Linux kernels since 2.6.19, MS_ASYNC does nothing, as the kernel tracks the changes properly anyway (no msync() is needed, except possibly before munmap()). I don't know if Android kernels have patches that change that behaviour; I suspect not. It is still a good idea to keep them in the code, for portability across POSIXy systems.
mapped data turns-out to be inconsistent temporarily
Well, unless you do use msync(partaddr, partlen, MS_SYNC | MS_INVALIDATE), the kernel will do the update when it sees best.
So, if you need some changes to be visible to file readers before proceeding, use msync(areaptr, arealen, MS_SYNC | MS_INVALIDATE) in the process doing those updates.
If you don't care about the exact moment, use msync(areaptr, arealen, MS_ASYNC | MS_INVALIDATE). It'll be a no-op on current Linux kernels, but it's a good idea to keep them for portability (perhaps commented out, if necessary for performance) and to remind developers about the (lack of) synchronization expectations.
As I commented to OP, I cannot observe the synchronization issues on Linux at all. (That does not mean it does not happen on Android, because Android kernels are derivatives of Linux kernels, not exactly the same.)
I do believe the msync() call is not needed on Linux kernels since 2.6.19 at all, as long as the mapping uses flags MAP_SHARED | MAP_NORESERVE, and the underlying file is not opened using the O_DIRECT flag. The reason for this belief is that in this case, both mapping and file accesses should use the exact same page cache pages.
Here are two test programs, that can be used to explore this on Linux. First, a single-process test, test-single.c:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported