crlf | handling CR/LF line endings in Go | Command Line Interface library
kandi X-RAY | crlf Summary
The crlf package helps in dealing with files that have DOS-style CR/LF line endings.
Top functions reviewed by kandi - BETA
- Transform applies \n \r to dst.
- Fuzz fuzzes data.
- Open opens a file named by name.
- Create returns an io.WriteCloser with the given name.
- NewReader returns a new Reader reading from r.
- NewWriter returns a new io.Writer.
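As a rough illustration of what a CR/LF-normalizing reader has to cope with, here is a minimal Python sketch. It is not the crlf package's API: the function name `normalize_chunks` is hypothetical, and the sketch assumes only CRLF pairs are converted to LF while a lone CR is left alone. The interesting case is a CRLF pair split across two reads.

```python
def normalize_chunks(chunks):
    """Yield the input byte chunks with every CRLF pair replaced by LF.

    A CR at the very end of a chunk is held back, because the LF that
    completes the pair may only arrive with the next chunk.
    """
    pending_cr = False
    for chunk in chunks:
        if pending_cr:
            # Re-attach the CR held back from the previous chunk.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # Cannot decide yet whether this CR starts a CRLF pair.
            chunk = chunk[:-1]
            pending_cr = True
        out = chunk.replace(b"\r\n", b"\n")
        if out:
            yield out
    if pending_cr:
        # Input ended on a lone CR: pass it through unchanged.
        yield b"\r"


# The CRLF after "a" is split across the first two chunks.
result = b"".join(normalize_chunks([b"a\r", b"\nb\r\n", b"c\r"]))
```

Holding back a trailing CR keeps the conversion correct regardless of how the underlying reads happen to be sized.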
Community Discussions
Trending Discussions on crlf
QUESTION
I'm dealing with LF/CRLF issues in a git repository and reading git's documentation to try to understand what I need to do.
One part of the documentation confuses me. It says:
...many editors on Windows silently replace existing LF-style line endings with CRLF, or insert both line-ending characters when the user hits the enter key. Git can handle this by auto-converting CRLF line endings into LF when you add a file to the index, and vice versa when it checks out code onto your filesystem. You can turn on this functionality with the core.autocrlf setting. If you're on a Windows machine, set it to true; this converts LF endings into CRLF when you check out code.
What I don't understand is: if I'm using Windows, why would I care to convert line endings from LF to CRLF? Would it be because my editor doesn't recognize LF line endings and thus shows all the code in a file as being one line? If it's that, then it seems that if I'm using an editor that does recognize such LF line endings and shows the code correctly even when the file is using LF line endings, then I wouldn't need to do the LF-to-CRLF conversion, right?
...ANSWER
Answered 2022-Apr-09 at 22:58
If your editor supports LF and you don't care about other people who might want to contribute to your repository then yes, for the most part, you don't need the conversion.
Any decent code editor from the last 20 years should handle LF files. Notepad, on the other hand, supported only CRLF for a very long time; this was fixed in Windows 10 version 1809.
There might still be some command-line tools that choke on LF. Perhaps the biggest issue is that command-line tools that fopen files in text mode using the Microsoft C run-time will output CRLF even when \n is used in their code.
In the end I suppose it is a matter of preference and where you want potential conversion errors to occur; in the git auto-conversion or in tools used to parse/process the files.
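The text-mode translation mentioned above can be imitated in Python, whose `open` accepts a `newline` argument that controls how `\n` is written. This is only an analogy for what the Microsoft C run-time does, not a reproduction of it; the helper name `write_text_mode` is made up for this sketch.

```python
import os
import tempfile


def write_text_mode(path, text, newline):
    """Write text in text mode with the given newline policy, return raw bytes."""
    with open(path, "w", newline=newline) as f:
        f.write(text)
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    # The program only ever writes "\n", yet the file can end up with CRLF.
    crlf_bytes = write_text_mode(path, "one\ntwo\n", "\r\n")
    # With newline="\n" no translation takes place.
    lf_bytes = write_text_mode(path, "one\ntwo\n", "\n")
```

The source text is identical in both calls; only the output layer decides which bytes land on disk, which is why such tools produce CRLF without any `\r` in their code.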
QUESTION
A build script I wrote is failing on a CI/CD pipeline (which runs on Linux) because somehow the build.sh script got converted to or saved in CRLF format (based on what I gather online), leading to this error:
...ANSWER
Answered 2021-Oct-02 at 21:17
Here are a few simple rules, although some of them are opinions:
- core.eol is not needed; don't bother with it.
- core.autocrlf should always be false.
- If you have naïve Windows users who will edit *.sh files on a Windows system and thereby insert CRLF line endings into them, use .gitattributes to correct this.
In the .gitattributes file, list the .sh files in question, or *.sh, along with the directives text eol=lf. List any other files that need special consideration too, while you're at it: *.jpg can have a binary directive, if you have JPG images in the repository; *.bat can be marked text eol=crlf; and so on.
This won't fix your existing problem; to do that, clone the repository, check out the bad commit at the tip of the current branch, modify the .sh file(s) to replace the existing CRLF line endings with LF-only line endings, and add and commit these files. (You can do this in the same commit in which you create the .gitattributes file.) If you have a reasonably modern Git, creating the .gitattributes file and then running git add --renormalize build.sh is supposed to do all of that (except the "create a new commit" step of course) in one fell swoop (or swell foop, if you're fond of Spoonerisms).
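The manual step of replacing CRLF with LF-only endings in the script can be done with any tool. As one possibility, here is a small Python sketch; the function name `crlf_to_lf` is hypothetical, and the sketch assumes the file is small enough to read into memory. It works on raw bytes so that nothing else in the file is re-encoded.

```python
import tempfile
from pathlib import Path


def crlf_to_lf(path):
    """Rewrite the file with LF-only endings; return True if it changed."""
    p = Path(path)
    original = p.read_bytes()
    converted = original.replace(b"\r\n", b"\n")
    if converted == original:
        return False
    p.write_bytes(converted)
    return True


# Demonstration on a throwaway copy of a CRLF-damaged script.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "build.sh"
    script.write_bytes(b"#!/bin/sh\r\necho hi\r\n")
    changed = crlf_to_lf(script)
    fixed = script.read_bytes()
```

After a conversion like this, the file still has to be added and committed as described above.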
Line-ending-fiddling in Git is an endless source of confusion. Part of the problem stems from the fact that people attempt to observe what's happening by inspecting the files in their working tree. That's akin to trying to figure out why the icemaker in your freezer isn't working by taking the trays out and putting them under extremely hot and bright lights, so that the plastic trays melt. If you do this, you are:
- looking in the wrong place, and
- using a tool that destroys the very information you might be looking for in the first place.
That is, the problem is elsewhere, and by the time you get around to looking for it, it's long gone.
To understand what's going on, and hence how and why anything that fixes the problem actually fixes the problem, you must first learn the Three Places Of Git where files can be found:
- Files are stored, permanently[1] and immutably, inside commits, in a special, read-only, Git-only, compressed and de-duplicated form. Each commit acts as an archive (kind of like a tar or zip archive) of every file as of the state that file had at the time you committed it.
  Because of the special properties of these files, they literally cannot be used by your computer, except by Git itself. They must therefore be extracted, like un-archiving an archive with tar -x or unzip.
- Files are stored in a usable form, as everyday files, in your working tree. This is where the extracted (unzipped, or whatever) files wind up. These files are not actually in Git at all. They are there for you to use as inputs and/or outputs, and your working tree is just an ordinary set of folders (or directories, whichever term you prefer) and files, stored in the way that is ordinary for your particular computer.[2]
That covers two places: so where is this "third place" I talk about? This is what Git calls, variously, the index, or the staging area, or—rarely these days—the cache. Git's index holds a third "copy" of every file. I put the word "copy" in quotes here because what's in the index is actually a sort of reference, using the de-duplication trick.
Initially, when you first use git checkout or git switch to extract a particular commit from a repository you've just cloned, what Git does is:
- "copy" each file into its own index: this "copy" is in the read-only Git-only compressed-and-de-duplicated form; then
- expand the file into usable form and put that into your working tree.
Note that before this step, Git's index was empty: it had no files in it at all. Now Git's index has every file from the current commit. These take no space: having come out of a commit, they are all already in the repository, so the index entries are duplicates and need no extra space to hold the data.[3]
So: what's the point of this index / staging-area / cache? Well, one point is that it makes Git go fast. Another is that it lets you partially stage files (though I won't cover what that means here). But in fact, it's not strictly necessary: other version control systems get away without having one. It's just that Git not only has it, Git forces you to use it. So you need to know about it, if only to know that it places itself between you and your files—in your working tree—and the commits in the repository.
By omitting a few details that eventually matter, but not yet, we can describe the index pretty well as your proposed next commit. That is, the index holds each file that will go into the next commit. These files are in Git's own format (compressed and de-duplicated) but, unlike the files inside a commit, you can replace them. You can't modify them (they're in the read-only format, and pre-de-duplicated), but you can run git add.
The git add command reads the working tree copy of some file. This working tree copy is the version you see and work with. If you've changed it, git add reads the changed version.[4] The add command compresses this data down into Git's special internal format and checks to see if it's a duplicate. If it is a duplicate, Git throws out its compression result and re-uses the existing data, updating the index with the re-used file. If it's not a duplicate, Git saves the compressed and de-duplicated (but first time now) file data and updates the index with that.
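The duplicate check can be pictured with a toy content-addressed store. The sketch below is a simplification under stated assumptions: `ToyObjectStore` is an invented class, compression is left out, and the ID is computed the way Git computes blob IDs in a SHA-1 repository (SHA-1 over a `blob <size>\0` header plus the data).

```python
import hashlib


def blob_id(data: bytes) -> str:
    """Blob ID as used by SHA-1 Git repositories: header plus raw content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ToyObjectStore:
    """Keeps one copy of each distinct content, keyed by its ID."""

    def __init__(self):
        self.objects = {}

    def add(self, data: bytes):
        oid = blob_id(data)
        is_new = oid not in self.objects
        if is_new:
            self.objects[oid] = data  # first time this content is seen
        return oid, is_new  # duplicates simply re-use the existing entry


store = ToyObjectStore()
first_id, first_new = store.add(b"echo hi\n")
second_id, second_new = store.add(b"echo hi\n")  # same bytes: a duplicate
crlf_id, crlf_new = store.add(b"echo hi\r\n")    # CRLF variant: new object
```

Because the ID depends on every byte, an LF version and a CRLF version of the "same" file are entirely different objects as far as the store is concerned.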
Either way, what's in the index now is the updated file. So the index now holds your proposed next commit. It held your proposed next commit before the git add too, but now your proposed next commit is updated. This tells us what the index is for from our point of view: The index holds your proposed next commit. You do not commit what is in your working tree. Instead, you commit what is in Git's index. This is why you need to know about the index: it's how Git makes new commits.
[1] The commits themselves are only permanent until you or Git remove them, but in a lot of cases that's "never". It's actually kind of hard to get rid of a Git commit, for many reasons. A file's data as stored in a commit, de-duplicated, remains in the repository until every commit that holds that file is removed, though.
[2] The actual file storage format inside computers is itself amazingly complicated and varied. Some systems do case-preserving but case-folding in file names, for instance, so that README.md and ReadMe.md are "the same file", while others say that these are two different files. Git holds the latter opinion, and when the commit archive holds both a README.md and a ReadMe.md, and you extract that commit to your working tree, one of those files goes missing from your working tree, since it's physically incapable of holding both (because they have the "same name" as far as your computer is concerned). Because Git's archived files are in a special Git-only format, this is not a problem for Git itself. But it can be a huge headache for you.
[3] The other properties stored in the index, such as the cache aspect, which helps Git go fast, do take a bit of space. The average tends to be somewhere close to 100 bytes per file, so unless you have a million files (which then needs ~100 MB of index), this is utterly trivial in modern systems where a chip the size of your fingernail provides 256 GB of storage.
[4] If you haven't changed it, git add tries to skip reading it, to make Git go fast. This will cause us problems in a moment. So sometimes you may find it useful to trick Git into thinking you've changed it. You can do this by rewriting the file in place, or using the touch command if you have that, for instance. The --renormalize flag to git add is supposed to fix this as well, but I have seen people say it doesn't.
Let's review quickly now:
- Every commit contains files-as-a-snapshot, in a frozen (read-only), compressed, de-duplicated format. Nothing, not even Git itself, can ever change any part of any commit.
- Git makes new commits from whatever is in Git's index. Git fills in the index from a commit when you check out the commit, and builds the new commit from whatever is in its index at the time you run git commit.
- Your working tree lets you see what came out of a commit: the files come out of the commit, go into Git's index, and then get copied and expanded to become ordinary files in your working tree. Your working tree lets you control what goes into a new commit: you run git add and the data get compressed, de-duplicated, and generally Git-ified and put into the index, ready to be committed.
Note that there are steps here where Git does something very easy for Git: copying a commit into the index doesn't change any of the files at all, as they're still in the special read-only, Git-only format. Making a new commit doesn't change any of the files at all: they just get packaged up into a (read-only) commit, from the (replaceable but still read-only) "copies" in the index. But there are two steps where Git does something much harder:
- As a file gets copied out of the index to your working tree, it gets expanded and transformed. Git has to change from compressed bytes to uncompressed bytes. This is an ideal time to change LF-only to CRLF and this is when Git will do that, if Git does it at all.
- As a file gets copied from the working tree to be compressed and Git-ified and checked if it's a duplicate, Git has to change from uncompressed bytes to compressed ones. This is an ideal time to change CRLF to LF-only and this is when Git will do that, if Git does it at all.
So it's copies in and out of the index where Git does CRLF line ending modification. Moreover, the "index -> working tree" step (which happens during git checkout, for instance) can only add CRs. It can't remove them. The "working tree -> index" step (which happens during git add, for instance) can only remove CRs, not add them.
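The two directions can be modeled as a pair of one-way functions. This is a toy model, not Git's conversion code: the names `to_index` and `to_worktree` are invented here, and real Git applies further rules (text detection, safety checks, files that already contain CRLF in the repository) that the sketch ignores.

```python
import re


def to_index(worktree_bytes: bytes) -> bytes:
    """'working tree -> index' direction: can only remove CRs."""
    return worktree_bytes.replace(b"\r\n", b"\n")


def to_worktree(index_bytes: bytes) -> bytes:
    """'index -> working tree' direction: can only add CRs.

    An LF that already follows a CR is left alone, so no CR is doubled.
    """
    return re.sub(rb"(?<!\r)\n", b"\r\n", index_bytes)


committed = b"line one\nline two\n"
checked_out = to_worktree(committed)  # what a CRLF working tree would see
re_added = to_index(checked_out)      # what goes back into the index
```

Round-tripping through both functions returns the LF-only bytes, which is why committed content drifts toward LF-only once conversion is switched on.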
This in turn means that, if you choose to start doing line ending transformation, the committed files inside the repository will eventually end up with LF-only line endings, over time. If some committed files have CRLF line endings now, they will, in those commits, have those endings forever, because no existing commit can be changed.
Optimizations that get in the way
Now we get to some of the optimizations:
- When checking out a commit, Git tries hard not to touch the working tree if possible. This is slow! Let's not do it if we don't have to.
- When using git add, Git tries hard not to touch the index if possible. It's too slow!
Suppose you check out some commit, say, deadbeef. It has 5923 files in it. Those files get "copied" into the index, which is really fast because these aren't real copies. But were there files in the index before? Say you had commit dadc0ffee out just before you switched to deadbeef. That commit had put 5752 files in the index, and then all you did was look at the working tree copies.
Obviously these files aren't all the same, but what if 5519 of the files were the same, leaving only 233 files to change and 171 new files to create? For whatever reason, there are no files in dadc0ffee that aren't in deadbeef, there are only new files. Or maybe one file does go away and Git will have to remove that one from the working tree and create 172 files. But either way, Git only needs to mess with 404 or 405 files in the working tree, not more than 5500. That's going to run about ten times faster.
So, Git does that. If Git can, it doesn't touch files. It assumes that if file path/to/file.ext in the index in commit dadc0ffee has the same raw hash ID as file path/to/file.ext in the index in commit deadbeef, it does not have to do anything to the working tree copy.
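The shortcut amounts to comparing two path-to-hash mappings and touching only the paths that differ. A minimal sketch, assuming each index is represented as a plain dictionary; the function name `plan_checkout`, the file names, and the short fake hash IDs are all made up for illustration:

```python
def plan_checkout(old_index, new_index):
    """Return (paths to write, paths to remove) when switching commits.

    A path whose hash ID is the same in both indexes is skipped entirely.
    """
    to_write = sorted(
        path for path, oid in new_index.items() if old_index.get(path) != oid
    )
    to_remove = sorted(path for path in old_index if path not in new_index)
    return to_write, to_remove


old_index = {"a.sh": "1111", "b.txt": "2222", "c.txt": "3333"}
new_index = {"a.sh": "1111", "b.txt": "9999", "d.txt": "4444"}
to_write, to_remove = plan_checkout(old_index, new_index)
# a.sh is in neither list: its hash is unchanged, so the working tree copy
# is never rewritten, whatever the line-ending settings now say.
```

Nothing in this comparison looks at line-ending configuration, which is the gap the surrounding discussion is about.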
This assumption breaks down in the presence of CRLF line ending trickiness. If Git is supposed to do LF to CRLF modifications on the way out, but didn't for dadc0ffee, Git may skip doing it for deadbeef too.
What this means is that whenever you change the CRLF line endings settings, you can end up with "wrong" line endings in your working tree. You can get around this by removing the working tree copy and then checking out the file again (with git restore or git reset --hard, for instance, though remember that git reset --hard loses uncommitted work!).
Meanwhile, if you run git add on some file, and Git thinks that the cached index copy is up to date (because you haven't edited the working tree copy, for instance), Git will silently do nothing at all. But if the working tree copy has CRLF line endings, and the index (and hence future commit) copy shouldn't, this is wrong. Using git add --renormalize is supposed to get around it, or you can "touch" the file so that Git sees a newer working-tree time stamp and will redo the copy. Or, you can even run git rm --cached on the file, and then git add really does have to copy it, because there's no longer a copy of that file in the index at all.
Using a .gitattributes file entry gives Git the most chance to get things right: Git can tell if the .gitattributes file entry affects some particular file. That gives Git the opportunity to do better cache checking, for instance. (Git currently doesn't use this opportunity properly, I think, but at least it offers the possibility.)
When you do use .gitattributes entries, they tell Git multiple things:
- this file definitely is or isn't text: do, or don't, mess with it;
- if you are going to mess with line endings, here's what to do.
This lets you say that *.bat files need to be CRLF-ended in the working tree, even on a Linux system, and *.sh files need to be LF-ended in the working tree, even on a Windows system.
You get as much control as Git is willing to give you:
- You get the ability to turn CRLF in the working tree into LF-only in the index and hence in future commits.
- You get the ability to turn LF-only in committed copies of files into CRLF in the working tree, on future extractions of this commit.
The one thing you lose is the easy and global effect of core.eol and core.autocrlf: these affect existing commits, and tell Git to guess whether each file is text. As long as Git guesses right, that tends to work sort-of-OK. It's when Git guesses wrong that things go really bad. But because these settings affect every file extraction (index-to-work-tree) and every git add (work-tree-to-index) that actually happens, and it's hard to know which ones happen, it's very hard to see what's going on.
QUESTION
When I create a new class using Visual Studio, it produces the following:
...ANSWER
Answered 2022-Mar-25 at 10:59
This is a known issue in Visual Studio. It has been fixed, and the fix should be included in the next Visual Studio release.
QUESTION
I noticed that when I have, in my git repository, a file with CRLF line endings, and I try to create and apply a patch which changes this file, it fails.
This is a simple way to reproduce the problem:
...ANSWER
Answered 2022-Mar-11 at 01:08
TL;DR: use git am --keep-cr.
The patch itself, in 0001-second-commit.patch, says, in effect: expect the last line to read bar plus a carriage return; add after that another line, also with a carriage return. But the "mail splitting" process that git am uses on a mailbox removes both carriage returns. Hence the internal git apply step that git am runs at this point effectively goes: Whoa, wait, hold on a minute! The original text here doesn't match! I'd better fail and make the human help out.
To fix this, tell git am to invoke the mail splitting process differently. By default, git mailsplit treats carriage returns at ends of lines as mistakes made by some email software, and removes them. The --keep-cr option tells git mailsplit that no, it's not a mistake: please keep those carriage returns. This option is also offered by git am itself, and git am (but not git mailsplit) has a configuration knob to turn it on by default: am.keepcr can be set to true.
If you've configured am.keepcr to true and want to override it temporarily, git am has --no-keep-cr as well.
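Why the stripped carriage returns make the patch fail can be shown with a toy comparison. This is only a model of the mismatch, not Git's apply algorithm; both helper functions are hypothetical, and the `foo` line is an assumed earlier line of the file.

```python
def strip_trailing_cr(lines):
    """Mimic a mail-splitting step that drops a CR at the end of each line."""
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def context_matches(file_lines, expected_lines):
    """Check that the file ends with the lines the patch expects to find."""
    return file_lines[-len(expected_lines):] == expected_lines


# Lines are shown without their LF; the file uses CRLF, so each keeps a CR.
file_lines = [b"foo\r", b"bar\r"]
patch_context = [b"bar\r"]  # what the patch says the last line looks like

kept = context_matches(file_lines, patch_context)
stripped = context_matches(file_lines, strip_trailing_cr(patch_context))
```

With the CR kept, the expected context equals the file's last line; once it is stripped, `bar` no longer equals `bar` plus a carriage return, and the match fails.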
QUESTION
I have four text files in the following directory that have varying EOL characters:
C:\Sandbox 1.txt, 2.txt, 3.txt, 4.txt
I would like to write a PowerShell script that will loop through all files in the directory, find the EOL characters used in each file, and print them into a new file named EOL.txt.
Sample contents of EOL.txt:
...ANSWER
Answered 2022-Mar-04 at 19:44
Try the following:
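The answer's PowerShell code is not included on this page. The same idea can be sketched in Python, with the caveat that this is not the original answer: the function names and the layout of the report lines are assumptions, since the sample contents of EOL.txt are not shown either.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path

EOL_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_eols(data: bytes) -> Counter:
    """Count each line-ending style; CRLF is matched first, as one unit."""
    counts = Counter()
    for match in re.finditer(rb"\r\n|\r|\n", data):
        counts[EOL_NAMES[match.group()]] += 1
    return counts


def describe_eols(data: bytes) -> str:
    counts = count_eols(data)
    if not counts:
        return "none"
    if len(counts) == 1:
        return next(iter(counts))
    details = ", ".join(f"{name}={n}" for name, n in sorted(counts.items()))
    return f"mixed ({details})"


def write_report(directory, report_name="EOL.txt"):
    """Write one 'file: style' line per .txt file and return those lines."""
    directory = Path(directory)
    lines = []
    for path in sorted(directory.glob("*.txt")):
        if path.name == report_name:
            continue  # do not report on the report itself
        lines.append(f"{path.name}: {describe_eols(path.read_bytes())}")
    (directory / report_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# Demonstration with four throwaway files standing in for C:\Sandbox.
with tempfile.TemporaryDirectory() as tmp:
    samples = {"1.txt": b"a\r\nb\r\n", "2.txt": b"a\nb\n",
               "3.txt": b"a\rb\r", "4.txt": b"a\r\nb\n"}
    for name, content in samples.items():
        (Path(tmp) / name).write_bytes(content)
    report = write_report(tmp)
```

Reading the files as bytes matters here: opening them in text mode would let the runtime translate the very characters being counted.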
QUESTION
I'm communicating with physical equipment via a Python socket. I'm fairly used to doing it, but not an expert. The commands I send to the equipment are usually found in the equipment's manuals. Usually, commands end with CR or LF (or both) and work well when I do something like:
mysocket.send((command+"\r\n").encode())
With "\r\n" acting as CRLF (although, as this answer says, CRLF and \r\n are not exactly the same thing).
I found a piece of equipment that needs commands to end with the ENQUIRY character which, according to the manual, has hexadecimal code 05 in the ASCII table (the manual also gives the hexadecimal codes of CR and LF as 0D and 0A, so I think it's correct).
Although I have looked into the ASCII table, it has no char related to this or even CR and LF. Is there a different table for chars used with these meanings in different languages? What should I put inside an encoded string to match the meaning in Python?
...ANSWER
Answered 2022-Feb-25 at 22:53
You can use
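The rest of this answer is cut off on this page. As a sketch of one way to do it (not necessarily what the answer went on to say): ENQ, CR and LF are ASCII control characters with codes 0x05, 0x0D and 0x0A, and Python can express any of them with a hexadecimal escape. The `frame` helper and the `STATUS?` command below are hypothetical.

```python
ENQ = b"\x05"  # ASCII ENQUIRY, hexadecimal 05
CR = b"\r"     # ASCII carriage return, hexadecimal 0D (same byte as b"\x0d")
LF = b"\n"     # ASCII line feed, hexadecimal 0A (same byte as b"\x0a")


def frame(command: str, terminator: bytes = ENQ) -> bytes:
    """Encode a command and append the terminator the device expects."""
    return command.encode("ascii") + terminator


enq_message = frame("STATUS?")            # made-up command, ends with ENQ
crlf_message = frame("STATUS?", CR + LF)  # same command, ends with CRLF
```

With the socket from the question this would be sent as `mysocket.send(frame(command))`.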
QUESTION
Despite having a global gitignore on Windows OS, I am still getting a bunch of "Untracked files".
Below is a truncated list of what it looks like:
...ANSWER
Answered 2022-Jan-25 at 00:42
Your problem is that your ignore file is in UTF-16. While UTF-16 is common on Windows, it is practically unused elsewhere outside of some programming languages, and UTF-8 has mostly supplanted it. Git accepts ignore files only in UTF-8 or other ASCII-compatible encodings (in the event your patterns contain non-UTF-8 characters), so you'll need to change your file to be in the proper format for it to work.
I would also recommend using LF endings, although I don't believe those are absolutely required for it to work.
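One way to do the conversion is a few lines of Python. This is a sketch under assumptions: the function name `convert_ignore_file` is invented, the file is assumed to start with a byte-order mark (as Windows tools usually write), and the ignore patterns in the demonstration are examples only.

```python
import tempfile
from pathlib import Path


def convert_ignore_file(path):
    """Rewrite a UTF-16 ignore file as UTF-8 with LF line endings."""
    p = Path(path)
    # The "utf-16" codec reads the byte-order mark to pick the byte order.
    text = p.read_bytes().decode("utf-16")
    text = text.replace("\r\n", "\n")
    p.write_bytes(text.encode("utf-8"))


# Demonstration on a throwaway file written the way a Windows tool might.
with tempfile.TemporaryDirectory() as tmp:
    ignore = Path(tmp) / "gitignore_global"
    ignore.write_bytes("*.log\r\nnode_modules/\r\n".encode("utf-16"))
    convert_ignore_file(ignore)
    converted = ignore.read_bytes()
```

Writing bytes rather than text keeps Python from translating the LF endings back to CRLF on Windows.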
QUESTION
When sending an email that contains Cyrillic characters from an Oracle DBMS to Outlook, the output is question marks. I do not understand how to convert the string so that the text is readable. I send it using the following method:
...ANSWER
Answered 2022-Jan-21 at 13:22
You need to add lines like this:
QUESTION
I'm using Prettier in my TypeScript project. I format all files on save in Visual Studio Code. I also configured a pre-commit git hook with Husky which checks the formatting of all files.
This is what my pre-commit hook file looks like:
ANSWER
Answered 2022-Jan-20 at 06:47
After digging into the issue more, and thanks to comments from @eDonkey and @torek, I found a solution. I tested it for 2 days and it seems to work.
Please note that this solution probably works for a project with Windows-only developers. If all of you are on Mac/Linux, you'd probably want to use LF instead of CRLF. If the team is mixed, I'm not sure there's a perfect solution here.
First, configure git to not do CRLF -> LF conversion:
QUESTION
I was reading about the .gitattributes file and the rule to force line endings. In some tutorials it's written as * text=auto, and in others as * text=auto eol=lf, on the first line of the file.
Are there any differences? What does the first one do exactly? Does it even force any line endings?
Also, some repositories mention that * text=auto performs LF normalization! I don't know whether that's true or not.
ANSWER
Answered 2022-Jan-10 at 23:27
There's a difference between these attributes. text asks Git to perform line ending conversion. Any time Git does this, it will store LF endings in the repository, and it will convert them when it checks files out in the working tree. text=auto asks Git to search the beginning of the file for a NUL byte, and if it finds one, then the file is binary and conversions are not performed; otherwise, the file is text, and conversions are performed. This usually works fine in most cases, and is a sensible default.
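The NUL-byte check described here is easy to picture in code. The following Python sketch implements only that description; it is not Git's actual detection routine, the function name is made up, and the size of the inspected window (`sample_size`) is an assumption.

```python
def looks_like_text(data: bytes, sample_size: int = 8000) -> bool:
    """Treat content as text unless a NUL byte appears near the beginning.

    sample_size is an assumed window; only that many leading bytes are checked.
    """
    return b"\0" not in data[:sample_size]


script_is_text = looks_like_text(b"#!/bin/sh\necho hi\n")
png_is_text = looks_like_text(b"\x89PNG\r\n\x1a\n\0\0\0\rIHDR")
# A NUL byte beyond the inspected window goes unnoticed by this heuristic.
late_nul_is_text = looks_like_text(b"a" * 8000 + b"\0")
```

The last case shows why such a guess can occasionally be wrong, and why files that need special handling are better listed explicitly in .gitattributes.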
By default, Git honors several configuration variables to decide what line ending conversion should be used in the working tree (LF or CRLF), unless the eol attribute is set. If eol is set, then (a) the file is automatically set to be text and (b) that line ending is always used.
So in the former case, * text=auto says, "Guess whether this is a text file, and if it is, check this file out with the user's preferred line endings." The eol=lf applies only to files that are guessed as text in this case, as of Git 2.10. In general, eol applies if text is set explicitly, text=auto is set and the file is detected as text, or if text is left unspecified; in Git 2.10 and newer, it doesn't affect files explicitly marked -text or detected as binary with text=auto.
However, if you're using older versions of Git, this can cause some binary files to be mishandled, since it will force them to always be text. If your repository contains only text files, then it will work, but this is better written as * text eol=lf. Otherwise, you can specify different types of files separately:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported