crlf | handling CR/LF line endings in Go | Command Line Interface library

by andybalholm Go Version: Current License: BSD-2-Clause


kandi X-RAY | crlf Summary

crlf is a Go library typically used in utilities and command-line interface applications. It has no reported bugs or vulnerabilities, a permissive license, and low support. You can download it from GitHub.

The crlf package helps in dealing with files that have DOS-style CR/LF line endings.

Support

crlf has a low active ecosystem.

It has 23 star(s) with 2 fork(s). There are 2 watchers for this library.

It had no major release in the last 6 months.

There are 0 open issues and 1 closed issue. There are no pull requests.

It has a neutral sentiment in the developer community.

The latest version of crlf is current.

Quality

crlf has 0 bugs and 0 code smells.

Security

crlf has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

crlf code analysis shows 0 unresolved vulnerabilities.

There are 0 security hotspots that need review.

License

crlf is licensed under the BSD-2-Clause License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

crlf releases are not available. You will need to build from source code and install.

Installation instructions are not available. Examples and code snippets are available.

It has 177 lines of code, 13 functions and 5 files.

It has medium code complexity. Code complexity directly impacts maintainability of the code.

Top functions reviewed by kandi - BETA

kandi has reviewed crlf and discovered the below as its top functions. This is intended to give you an instant insight into crlf implemented functionality, and help decide if they suit your requirements.

Transform applies \n \r to dst.
Fuzz fuzzes data.
Open opens a file named by name.
Create returns an io.WriteCloser with the given name.
NewReader returns a new Reader reading from r.
NewWriter returns a new io.Writer.
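
To make the idea behind a CR/LF-translating reader concrete, here is a minimal sketch that uses only the Go standard library. It is not the crlf package's implementation or API: the names lfReader, newLFReader and normalize are hypothetical, and the only behavior assumed is "turn each CR LF pair into a single LF while reading".

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// lfReader is a hypothetical type for illustration; it is not the crlf
// package's Reader. It turns each CR LF pair from the source into one LF.
type lfReader struct {
	br *bufio.Reader
}

func newLFReader(r io.Reader) *lfReader {
	return &lfReader{br: bufio.NewReader(r)}
}

// Read fills p with bytes from the source, dropping any CR that is
// immediately followed by LF. A lone CR is passed through unchanged.
func (l *lfReader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		b, err := l.br.ReadByte()
		if err != nil {
			if n > 0 {
				return n, nil // deliver what was read; the error is reported by a later call
			}
			return 0, err
		}
		if b == '\r' {
			if next, perr := l.br.Peek(1); perr == nil && next[0] == '\n' {
				continue // skip the CR; the LF is read on the next iteration
			}
		}
		p[n] = b
		n++
	}
	return n, nil
}

// normalize is a small helper that runs a string through lfReader.
func normalize(s string) (string, error) {
	out, err := io.ReadAll(newLFReader(strings.NewReader(s)))
	return string(out), err
}

func main() {
	out, err := normalize("one\r\ntwo\r\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out)
}
```

Wrapping the source in a bufio.Reader lets the sketch peek one byte ahead, so a CR that happens to fall at the end of one underlying read is still paired with the LF that follows it.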


crlf Key Features

No Key Features are available at this moment for crlf.

crlf Examples and Code Snippets

No Code Snippets are available at this moment for crlf.

Community Discussions

Trending Discussions on crlf

Why would a Windows user want to convert LF endings to CRLF?

/bin/sh^M: bad interpreter: No such file or directory error caused by different GIT env configs overriding each other?

Visual Studio doesn't put space in between 'internalclass' due to EditorConfig

why I can't apply git patch when a file uses CRLF?

Is there a Powershell command that will print a specified file's EOL characters?

Which is the ENQ character in python?

Windows global gitignore not working... "Untracked files"

Cyrillic characters are distorted when sending a letter

Prettier on git commit hook shows code style issues, but only CRLF differences

What is the difference between "* text=auto" and "* text=auto eol=lf"?

QUESTION

Why would a Windows user want to convert LF endings to CRLF?

Asked 2022-Apr-09 at 23:00

I'm dealing with LF/CRLF issues in a git repository and reading git's documentation to try to understand what I need to do.

One part of the documentation is confusing to me: they write here:

...many editors on Windows silently replace existing LF-style line endings with CRLF, or insert both line-ending characters when the user hits the enter key. Git can handle this by auto-converting CRLF line endings into LF when you add a file to the index, and vice versa when it checks out code onto your filesystem. You can turn on this functionality with the core.autocrlf setting. If you’re on a Windows machine, set it to true — this converts LF endings into CRLF when you check out code.

What I don't understand is: if I'm using Windows, why would I care to convert line endings from LF to CRLF? Would it be because my editor doesn't recognize LF line endings and thus shows all the code in a file as being one line? If it's that, then it seems that if I'm using an editor that does recognize such LF line endings and shows the code correctly even when the file is using LF line endings, then I wouldn't need to do the LF-to-CRLF conversion, right?

...

ANSWER

Answered 2022-Apr-09 at 22:58

If your editor supports LF and you don't care about other people that might want to contribute to your repository then yes, for the most part, you don't need the conversion.

Any decent code editor from the last 20 years should handle LF files. Notepad, on the other hand, only supported CRLF for a very long time; this was fixed in Windows 10 1809.

There still might be some command line tools that choke on LF, and perhaps the biggest issue: command line tools that fopen files in text mode using the Microsoft C run-time will output CRLF even when \n is used in their code.

In the end I suppose it is a matter of preference and where you want potential conversion errors to occur; in the git auto-conversion or in tools used to parse/process the files.

Source https://stackoverflow.com/questions/71812378

QUESTION

/bin/sh^M: bad interpreter: No such file or directory error caused by different GIT env configs overriding each other?

Asked 2022-Apr-04 at 07:05

A build script I wrote is failing on a CI/CD pipeline (that runs on Linux) because somehow the build.sh script got converted/saved in CRLF format (based on what I gather online), leading to this error:

...

ANSWER

Answered 2021-Oct-02 at 21:17

Here are a few simple rules, although some of them are opinions:

core.eol is not needed; don't bother with it.
core.autocrlf should always be false.
If you have naïve Windows users who will edit *.sh files on a Windows system and thereby insert CRLF line endings into them, use .gitattributes to correct this.

In the .gitattributes file, list the .sh files in question, or *.sh, along with the directives text eol=lf. List any other files that need special consideration too, while you're at it: *.jpg can have a binary directive, if you have JPG images in the repository; *.bat can be marked text eol=crlf; and so on.

This won't fix your existing problem; to do that, clone the repository, check out the bad commit at the tip of the current branch, modify the .sh file(s) to replace the existing CRLF line endings with LF-only line endings, and add and commit these files. (You can do this in the same commit in which you create the .gitattributes file.) If you have a reasonably modern Git, creating the .gitattributes file and then running git add --renormalize build.sh is supposed to do all of that (except the "create a new commit" step of course) in one fell swoop (or swell foop, if you're fond of Spoonerisms).
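
As a rough illustration of the manual repair step described above (replacing CRLF with LF-only endings in a .sh file), the following Go sketch checks whether a script's "#!" line ends in a carriage return and rewrites the endings. The function names hasCRShebang and toLF and the sample script are hypothetical; this is a sketch of the byte-level fix, not something Git runs.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasCRShebang reports whether the first line is a "#!" line that ends in
// CR, the situation behind the "/bin/sh^M: bad interpreter" error.
func hasCRShebang(script []byte) bool {
	first, _, _ := bytes.Cut(script, []byte("\n"))
	return bytes.HasPrefix(first, []byte("#!")) && bytes.HasSuffix(first, []byte("\r"))
}

// toLF replaces every CR LF pair with a bare LF.
func toLF(script []byte) []byte {
	return bytes.ReplaceAll(script, []byte("\r\n"), []byte("\n"))
}

func main() {
	script := []byte("#!/bin/sh\r\necho hello\r\n") // sample content, assumed for the demo
	fmt.Println(hasCRShebang(script))
	fmt.Println(hasCRShebang(toLF(script)))
}
```

The interpreter path is read up to the LF, so a trailing CR becomes part of the path, which is why the error message shows /bin/sh^M.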

What's going on here?

Line-ending-fiddling in Git is an endless source of confusion. Part of the problem stems from the fact that people attempt to observe what's happening by inspecting the files in their working tree. That's akin to trying to figure out why the icemaker in your freezer isn't working by taking the trays out and putting them under extremely hot and bright lights, so that the plastic trays melt. If you do this, you are:

looking in the wrong place, and
using a tool that destroys the very information you might be looking for in the first place.

That is, the problem is elsewhere, and by the time you get around to looking for it, it's long gone.

To understand what's going on, and hence how and why anything that fixes the problem actually fixes the problem, you must first learn the Three Places Of Git where files can be found:

Files are stored, permanently¹ and immutably, inside commits, in a special, read-only, Git-only, compressed and de-duplicated form. Each commit acts as an archive—kind of like a tar or zip archive—of every file as of the state that file had at the time you committed it.

Because of the special properties of these files, they literally cannot be used by your computer, except by Git itself. They must therefore be extracted, like un-archiving an archive with tar -x or unzip.
Files are stored in a usable form, as everyday files, in your working tree. This is where the extracted (unzipped, or whatever) files wind up. These files are not actually in Git at all. They are there for you to use as inputs and/or outputs, and your working tree is just an ordinary set of folders (or directories, whichever term you prefer) and files, stored in the way that is ordinary for your particular computer.²

That covers two places: so where is this "third place" I talk about? This is what Git calls, variously, the index, or the staging area, or—rarely these days—the cache. Git's index holds a third "copy" of every file. I put the word "copy" in quotes here because what's in the index is actually a sort of reference, using the de-duplication trick.

Initially, when you first use git checkout or git switch to extract a particular commit from a repository you've just cloned, what Git does is:

"copy" each file into its own index: this "copy" is in the read-only Git-only compressed-and-de-duplicated form; then
expand the file into usable form and put that into your working tree.

Note that before this step, Git's index was empty: it had no files in it at all. Now Git's index has every file from the current commit. These take no space, because they're de-duplicated and—having come out of a commit—they're all already in the repository so they are duplicates and therefore these copies use no space to hold the data.³

So: what's the point of this index / staging-area / cache? Well, one point is that it makes Git go fast. Another is that it lets you partially stage files (though I won't cover what that means here). But in fact, it's not strictly necessary: other version control systems get away without having one. It's just that Git not only has it, Git forces you to use it. So you need to know about it, if only to know that it places itself between you and your files—in your working tree—and the commits in the repository.

By omitting a few details that eventually matter, but not yet, we can describe the index pretty well as your proposed next commit. That is, the index holds each file that will go into the next commit. These files are in Git's own format—compressed and de-duplicated—but, unlike the files inside a commit, you can replace them. You can't modify them (they're in the read-only format, and pre-de-duplicated), but you can run git add.

The git add command reads the working tree copy of some file. This working tree copy is the version you see and work with. If you've changed it, git add reads the changed version.⁴ The add command compresses this data down into Git's special internal format and checks to see if it's a duplicate. If it is a duplicate, Git throws out its compression result and re-uses the existing data, updating the index with the re-used file. If it's not a duplicate, Git saves the compressed and de-duplicated (but first time now) file data and updates the index with that.

Either way, what's in the index now is the updated file. So the index now holds your proposed next commit. It held your proposed next commit before the git add too, but now your proposed next commit is updated. This tells us what the index is for from our point of view: The index holds your proposed next commit. You do not commit what is in your working tree. Instead, you commit what is in Git's index. This is why you need to know about the index: it's how Git makes new commits.

¹The commits themselves are only permanent until you or Git remove them, but in a lot of cases that's "never". It's actually kind of hard to get rid of a Git commit, for many reasons. A file's data as stored in a commit, de-duplicated, remains in the repository until every commit that holds that file is removed, though.

²The actual file storage format inside computers is itself amazingly complicated and varied. Some systems do case-preserving but case-folding in file names, for instance, so that README.md and ReadMe.md are "the same file", while others say that these are two different files. Git holds the latter opinion, and when the commit archive holds both a README.md and a ReadMe.md, and you extract that commit to your working tree, one of those files goes missing from your working tree, since it's physically incapable of holding both (because they have the "same name" as far as your computer is concerned). Because Git's archived files are in a special Git-only format, this is not a problem for Git itself. But it can be a huge headache for you.

³The other properties stored in the index, such as the cache aspect, which helps Git go fast, do take a bit of space. The average tends to be somewhere close to 100 bytes per file, so unless you have a million files (which then needs ~100 MB of index), this is utterly trivial in modern systems where a chip the size of your fingernail provides 256 GB of storage.

⁴If you haven't changed it, git add tries to skip reading it, to make Git go fast. This will cause us problems in a moment. So sometimes you may find it useful to trick Git into thinking you've changed it. You can do this by rewriting the file in place, or using the touch command if you have that, for instance. The --renormalize flag to git add is supposed to fix this as well, but I have seen people say it doesn't.

How this relates to line endings

Let's review quickly now:

Every commit contains files-as-a-snapshot, in a frozen (read-only), compressed, de-duplicated format. Nothing, not even Git itself, can ever change any part of any commit.
Git makes new commits from whatever is in Git's index. Git fills in the index from a commit when you check out the commit, and builds the new commit from whatever is in its index at the time you run git commit.
Your working tree lets you see what came out of a commit: the files come out of the commit, go into Git's index, and then get copied and expanded to become ordinary files in your working tree. Your working tree lets you control what goes into a new commit: you run git add and the data get compressed, de-duplicated, and generally Git-ified and put into the index, ready to be committed.

Note that there are steps here where Git does something very easy for Git: copying a commit into the index doesn't change any of the files at all, as they're still in the special read-only, Git-only format. Making a new commit doesn't change any of the files at all: they just get packaged up into a (read-only) commit, from the (replaceable but still read-only) "copies" in the index. But there are two steps where Git does something much harder:

As a file gets copied out of the index to your working tree, it gets expanded and transformed. Git has to change from compressed bytes to uncompressed bytes. This is an ideal time to change LF-only to CRLF and this is when Git will do that, if Git does it at all.
As a file gets copied from the working tree to be compressed and Git-ified and checked if it's a duplicate, Git has to change from uncompressed bytes to compressed ones. This is an ideal time to change CRLF to LF-only and this is when Git will do that, if Git does it at all.

So it's copies in and out of the index where Git does CRLF line ending modification. Moreover, the "index -> working tree" step—which happens during git checkout, for instance—can only add CRs. It can't remove them. The "working tree -> index" step—which happens during git add, for instance—can only remove CRs, not add them.

This in turn means that, if you choose to start doing line ending transformation, the committed files inside the repository will eventually end up with LF-only line endings, over time. If some committed files have CRLF line endings now, they will, in those commits, have those endings forever, because no existing commit can be changed.
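
The two directions described in this section can be modeled in a few lines of Go. This is a simplified model under stated assumptions, not Git's conversion code: toIndex and toWorkTree are hypothetical names, and normalizing before adding CRs is a choice made for this sketch so that an existing CR LF does not become CR CR LF.

```go
package main

import (
	"bytes"
	"fmt"
)

// toIndex models the "working tree -> index" step: it can only remove CRs.
func toIndex(worktree []byte) []byte {
	return bytes.ReplaceAll(worktree, []byte("\r\n"), []byte("\n"))
}

// toWorkTree models the "index -> working tree" step: it can only add CRs.
// The input is normalized first (a choice of this sketch) so that an
// existing CR LF pair is not expanded a second time.
func toWorkTree(index []byte) []byte {
	lf := toIndex(index)
	return bytes.ReplaceAll(lf, []byte("\n"), []byte("\r\n"))
}

func main() {
	committed := []byte("a\nb\n")
	checkedOut := toWorkTree(committed)
	fmt.Printf("%q\n", checkedOut)
	fmt.Printf("%q\n", toIndex(checkedOut))
}
```

In this model a file that goes out to the working tree and comes back through toIndex always lands in the index with LF-only endings, which matches the observation that committed files drift toward LF-only over time.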

Optimizations that get in the way

Now we get to some of the optimizations:

When checking out a commit, Git tries hard not to touch the working tree if possible. This is slow! Let's not do it if we don't have to.
When using git add, Git tries hard not to touch the index if possible. It's too slow!

Suppose you check out some commit, say, deadbeef. It has 5923 files in it. Those files get "copied" into the index, which is really fast because these aren't real copies. But were there files in the index before? Say you had commit dadc0ffee out just before you switched to deadbeef. That commit had put 5752 files in the index, and then all you did was look at the working tree copies.

Obviously these files aren't all the same, but what if 5519 of the files were the same, leaving only 233 files to change and 171 new files to create. For whatever reason, there are no files in dadc0ffee that aren't in deadbeef, there are only new files. Or maybe one file does go away and Git will have to remove that one from the working tree and create 172 files. But either way, Git only needs to mess with 404 or 405 files in the working tree, not more than 5500. That's going to run about ten times faster.

So, Git does that. If Git can, it doesn't touch files. It assumes that if file path/to/file.ext in the index in commit dadc0ffee has the same raw hash ID as file path/to/file.ext in the index in commit deadbeef, it does not have to do anything to the working tree copy.

This assumption breaks down in the presence of CRLF line ending trickiness. If Git is supposed to do LF to CRLF modifications on the way out, but didn't for dadc0ffee, Git may skip doing it for deadbeef too.

What this means is that whenever you change the CRLF line endings settings, you can end up with "wrong" line endings in your working tree. You can get around this by removing the working tree copy and then checking out the file again (with git restore or git reset --hard, for instance, though remember that git reset --hard loses uncommitted work!).

Meanwhile, if you run git add on some file, and Git thinks that the cached index copy is up to date—because you haven't edited the working tree copy, for instance—Git will silently do nothing at all. But if the working tree copy has CRLF line endings, and the index (and hence future commit) copy shouldn't, this is wrong. Using git add --renormalize is supposed to get around it, or you can "touch" the file so that Git sees a newer working-tree time stamp and will redo the copy. Or, you can even run git rm --cached on the file, and then git add really does have to copy it, because there's no longer a copy of that file in the index at all.
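
The checkout-side optimization can be imitated with a toy model. The sketch below is an assumption-laden illustration, not Git's data structures: the workTree type and its methods are hypothetical, SHA-256 stands in for Git's object IDs, and the only rule modeled is "skip the file if the blob hash has not changed".

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// workTree is a toy model: file contents on disk plus the hash of the
// blob each file was last extracted from.
type workTree struct {
	files  map[string][]byte
	hashes map[string][sha256.Size]byte
}

func newWorkTree() *workTree {
	return &workTree{
		files:  make(map[string][]byte),
		hashes: make(map[string][sha256.Size]byte),
	}
}

// checkout writes blob to path, adding CRs if addCR is set. It returns
// false when the write was skipped because the blob hash is unchanged.
func (w *workTree) checkout(path string, blob []byte, addCR bool) bool {
	h := sha256.Sum256(blob)
	if old, ok := w.hashes[path]; ok && old == h {
		return false // same blob as before: the file is not touched
	}
	out := blob
	if addCR {
		out = bytes.ReplaceAll(blob, []byte("\n"), []byte("\r\n"))
	}
	w.files[path] = out
	w.hashes[path] = h
	return true
}

// remove deletes the working tree copy so the next checkout must rewrite it.
func (w *workTree) remove(path string) {
	delete(w.files, path)
	delete(w.hashes, path)
}

func main() {
	w := newWorkTree()
	blob := []byte("line\n")
	w.checkout("build.sh", blob, false)
	fmt.Println(w.checkout("build.sh", blob, true)) // settings changed, blob did not
	fmt.Printf("%q\n", w.files["build.sh"])
	w.remove("build.sh")
	w.checkout("build.sh", blob, true)
	fmt.Printf("%q\n", w.files["build.sh"])
}
```

Because the skip decision looks only at the blob hash, changing the conversion setting alone never triggers a rewrite in this model; removing the file first does.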

Summary: the reason for the "simple rules" above

Using a .gitattributes file entry gives Git the most chance to get things right: Git can tell if the .gitattributes file entry affects some particular file. That gives Git the opportunity to do better cache checking, for instance. (Git currently doesn't use this opportunity properly, I think, but at least it offers the possibility.)

When you do use .gitattributes entries, they tell Git multiple things:

this file definitely is or isn't text: do, or don't, mess with it;
if you are going to mess with line endings, here's what to do.

This lets you say that *.bat files need to be CRLF-ended in the working tree, even on a Linux system, and *.sh files need to be LF-ended in the working tree, even on a Windows system.

You get as much control as Git is willing to give you:

You get the ability to turn CRLF in the working tree into LF-only in the index and hence in future commits.
You get the ability to turn LF-only in committed copies of files into CRLF in the working tree, on future extractions of this commit.

The one thing you lose is the easy and global effect of core.eol and core.autocrlf: these affect existing commits, and tell Git to guess whether each file is text. As long as Git guesses right, that tends to work sort-of-OK. It's when Git guesses wrong that things go really bad. But because these settings affect every file extraction (index-to-work-tree) and every git add (work-tree-to-index) that actually happens, and it's hard to know which ones happen, it's very hard to see what's going on.

Source https://stackoverflow.com/questions/69416842

QUESTION

Visual Studio doesn't put space in between 'internalclass' due to EditorConfig

Asked 2022-Mar-25 at 11:09

When I create a new class using Visual Studio, it produces the following:

...

ANSWER

Answered 2022-Mar-25 at 10:59

This is a known issue in Visual Studio. It has been fixed, and the fix should be included in the next Visual Studio release.

Source https://stackoverflow.com/questions/71615140

QUESTION

why I can't apply git patch when a file uses CRLF?

Asked 2022-Mar-11 at 01:08

I noticed that when I have, in my git repository, a file with CRLF line endings, and I try to create and apply a patch which changes this file, it fails.

This is a simple way to reproduce the problem:

...

ANSWER

Answered 2022-Mar-11 at 01:08

TL;DR: use git am --keep-cr.

The patch itself, in 0001-second-commit.patch, says, in effect: expect the last line to read bar plus a carriage return; add after that another line, also with a carriage return, but the "mail splitting" process that git am uses on a mailbox removes both carriage returns. Hence the internal git apply step that git am runs at this point effectively goes: Whoa, wait, hold on a minute! The original text here doesn't match! I'd better fail and make the human help out.

To fix this, tell git am to invoke the mail splitting process differently. By default, git mailsplit treats carriage returns at ends of lines as mistakes made by some email software, and removes them. The --keep-cr option tells git mailsplit that no, it's not a mistake: please keep those carriage returns. This option is also offered by git am itself, and git am (but not git mailsplit) has a configuration knob to turn it on by default: am.keepcr can be set to true.

If you've configured am.keepcr to true and want to override it temporarily, git am has --no-keep-cr as well.
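
The mismatch described in this answer can be reproduced in miniature. The Go sketch below is only a model of the effect, not of git mailsplit or git apply: stripCR and contextMatches are hypothetical helpers, and the file and patch lines are made-up sample data.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCR mimics the default behaviour described above: a CR at the end
// of a line is treated as mail-software damage and removed.
func stripCR(lines []string) []string {
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = strings.TrimSuffix(l, "\r")
	}
	return out
}

// contextMatches reports whether every context line equals the file line
// at the same position.
func contextMatches(file, context []string) bool {
	if len(context) > len(file) {
		return false
	}
	for i := range context {
		if file[i] != context[i] {
			return false
		}
	}
	return true
}

func main() {
	file := []string{"foo\r", "bar\r"}    // lines of a CRLF file, LF already split off
	context := []string{"foo\r", "bar\r"} // what the patch expects to find
	fmt.Println(contextMatches(file, context))
	fmt.Println(contextMatches(file, stripCR(context)))
}
```

Once the carriage returns are stripped from the patch, its context no longer equals the file's lines byte for byte, so the comparison fails.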

Source https://stackoverflow.com/questions/71423750

QUESTION

Is there a Powershell command that will print a specified file's EOL characters?

Asked 2022-Mar-04 at 19:44

I have four text files in the following directory that have varying EOL characters:

C:\Sandbox 1.txt, 2.txt, 3.txt, 4.txt

I would like to write a powershell script that will loop through all files in the directory and find the EOL characters that are being used for each file and print them into a new file named EOL.txt

Sample contents of EOL.txt:

...
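
The detection logic being asked about is independent of PowerShell. As an illustration only (written in Go, and not the answer given to this question), the sketch below counts each kind of line ending in a byte slice; the eolCounts type and countEOLs function are hypothetical names.

```go
package main

import "fmt"

// eolCounts holds how many line endings of each kind were seen.
type eolCounts struct {
	CRLF, LF, CR int
}

// countEOLs scans data once and classifies every line ending it finds.
func countEOLs(data []byte) eolCounts {
	var c eolCounts
	for i := 0; i < len(data); i++ {
		switch data[i] {
		case '\r':
			if i+1 < len(data) && data[i+1] == '\n' {
				c.CRLF++
				i++ // consume the LF that belongs to this pair
			} else {
				c.CR++
			}
		case '\n':
			c.LF++
		}
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", countEOLs([]byte("a\r\nb\nc\rd\r\n")))
}
```

A file whose counts are non-zero in more than one field has mixed line endings.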

ANSWER

Answered 2022-Mar-04 at 19:44

Try the following:

Source https://stackoverflow.com/questions/71342807

QUESTION

Which is the ENQ character in python?

Asked 2022-Feb-25 at 22:53

I'm communicating with physical equipment via a Python socket. I'm kind of used to doing it, but not an expert. The commands which I send to the equipment are usually found in the equipment's manuals. Usually, commands end with CR or LF (or both) and work well when I do something like:

mysocket.send((command+"\r\n").encode())

With "\r\n" acting as CRLF (although, as this answer says, CRLF and \r\n are not exactly the same thing)

I found a piece of equipment which needs commands to end with the ENQUIRY char which, according to the manual, has hexadecimal code equal to 05 in the ASCII table (the manual also gives the hexadecimal codes of CR and LF, which are 0D and 0A, so I think it's correct).

Although I have looked into the ASCII table, it has no char related to this or even CR and LF. Is there a different table for chars used with these meanings in different languages? What should I put inside an encoded string to match the meaning in Python?
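
ENQ, CR and LF are control characters in the ASCII table with codes 0x05, 0x0D and 0x0A, so they have no printable glyph and are written as escapes or numeric byte values. The following Go sketch shows the byte-level idea in a different language from the question; the frame function and the command string "AB" are hypothetical.

```go
package main

import "fmt"

// ASCII control codes named in the question.
const (
	enq = 0x05 // ENQ (enquiry)
	lf  = 0x0A // LF (line feed), the same byte as "\n"
	cr  = 0x0D // CR (carriage return), the same byte as "\r"
)

// frame appends a single terminator byte to a command string.
func frame(command string, terminator byte) []byte {
	return append([]byte(command), terminator)
}

func main() {
	// "AB" is a placeholder command, not one taken from any manual.
	fmt.Printf("% x\n", frame("AB", enq))
	fmt.Println("\r\n" == string([]byte{cr, lf}))
}
```

The same byte can be written in a string literal as the escape \x05.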

...

ANSWER

Answered 2022-Feb-25 at 22:53

You can use

Source https://stackoverflow.com/questions/71272499

QUESTION

Windows global gitignore not working... "Untracked files"

Asked 2022-Jan-25 at 02:35

Despite having a global gitignore on Windows OS, I am still getting a bunch of "Untracked files".

Below is a truncated list of what it looks like:

...

ANSWER

Answered 2022-Jan-25 at 00:42

Your problem is that your ignore file is in UTF-16. While UTF-16 is common on Windows, it is practically unused elsewhere outside of some programming languages, and UTF-8 has mostly supplanted it. Git accepts ignore files only in UTF-8 or other ASCII-compatible encodings (in the event your patterns contain non-UTF-8 characters), so you'll need to change your file to be in the proper format for it to work.

I would also recommend using LF endings, although I don't believe those are absolutely required for it to work.
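
One quick way to spot the situation described above is to look for a UTF-16 byte order mark at the start of the file. The Go sketch below assumes the file begins with a BOM (FF FE or FE FF); UTF-16 text without a BOM is not detected, and UTF-32 little-endian, which also begins with FF FE, is not told apart. The function name looksUTF16 is hypothetical and this check is not something Git performs.

```go
package main

import (
	"bytes"
	"fmt"
)

// looksUTF16 reports whether data starts with a UTF-16 byte order mark,
// either little-endian (FF FE) or big-endian (FE FF).
func looksUTF16(data []byte) bool {
	return bytes.HasPrefix(data, []byte{0xFF, 0xFE}) ||
		bytes.HasPrefix(data, []byte{0xFE, 0xFF})
}

func main() {
	utf16 := []byte{0xFF, 0xFE, '*', 0x00, '.', 0x00} // "*." in UTF-16LE with BOM
	utf8 := []byte("*.log\n")
	fmt.Println(looksUTF16(utf16))
	fmt.Println(looksUTF16(utf8))
}
```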

Source https://stackoverflow.com/questions/70841698

QUESTION

Cyrillic characters are distorted when sending a letter

Asked 2022-Jan-22 at 12:53

When sending a letter to Outlook mail from an Oracle DBMS that contains Cyrillic characters, the output is question marks. I do not understand how to convert the string so that the text is readable. I send it using the following method:

...

ANSWER

Answered 2022-Jan-21 at 13:22

You need to add lines like this:

Source https://stackoverflow.com/questions/70801787

QUESTION

Prettier on git commit hook shows code style issues, but only CRLF differences

Asked 2022-Jan-20 at 06:47

I'm using Prettier in my TypeScript project. I format all files on save in Visual Studio Code. I also configured a pre-commit git hook with Husky which checks the formatting of all files.

This is what my pre-commit hook file looks like:

...

ANSWER

Answered 2022-Jan-20 at 06:47

After digging into the issue more, and thanks to comments from @eDonkey and @torek, I found a solution. I tested it for 2 days and it seems to work.

Please note, that this solution probably works for a project with Windows-only developers. If all of you are on Mac/Linux, you'd probably want to use LF instead of CRLF. If the team is mixed, I'm not sure there's a perfect solution here.

First, configure git to not do CRLF -> LF conversion:

Source https://stackoverflow.com/questions/70750342

QUESTION

What is the difference between "* text=auto" and "* text=auto eol=lf"?

Asked 2022-Jan-10 at 23:27

I was reading about the .gitattributes file and the rule to force line endings. In some tutorials it's written like * text=auto, and in some others it's like * text=auto eol=lf at the first line of the file.

Are there any differences? what does the first one exactly do? Does it even force any line endings?

Also, in some repositories it's mentioned that * text=auto performs LF normalization! I don't know whether it's true or not.

...

ANSWER

Answered 2022-Jan-10 at 23:27

There's a difference between these attributes. text asks Git to perform line ending conversion. Any time Git does this, it will store LF endings in the repository, and it will convert them when it checks files out in the working tree. text=auto asks Git to search the beginning of the file for a NUL byte, and if it finds one, then the file is binary and conversions are not performed; otherwise, the file is text, and conversions are performed. This usually works fine in most cases, and is a sensible default.
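
The text-or-binary guess described above can be sketched as a simple check for a NUL byte near the start of the data. This follows the description in the answer rather than Git's source code: looksBinary is a hypothetical name, and the window size is a parameter of this sketch, not a value taken from Git.

```go
package main

import (
	"bytes"
	"fmt"
)

// looksBinary reports whether a NUL byte occurs within the first window
// bytes of data. A negative window means "inspect everything".
func looksBinary(data []byte, window int) bool {
	if window >= 0 && window < len(data) {
		data = data[:window]
	}
	return bytes.IndexByte(data, 0) >= 0
}

func main() {
	text := []byte("hello\nworld\n")
	blob := []byte("abc\x00def")
	fmt.Println(looksBinary(text, 8000)) // 8000 is an arbitrary window chosen for the demo
	fmt.Println(looksBinary(blob, 8000))
}
```

A file that passes this check would be treated as text and become eligible for line ending conversion; one that fails would be left alone.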

By default, Git honors several configuration variables to decide what line ending conversion should be used in the working tree (LF or CRLF), unless the eol attribute is set. If eol is set, then (a) the file is automatically set to be text and (b) that line ending is always used.

So in the former case, * text=auto says, "Guess whether this is a text file, and if it is, check this file out with the user's preferred line endings." The eol=lf applies only to files that are guessed as text in this case, as of Git 2.10. In general, eol applies if text is set explicitly, text=auto is set and the file is detected as text, or if text is left unspecified; in Git 2.10 and newer, it doesn't affect files explicitly marked -text or detected as binary with text=auto.

However, if you're using older versions of Git, this can cause some binary files to be mishandled, since it will force them to always be text. If your repository contains only text files, then it will work, but this is better written as * text eol=lf. Otherwise, you can specify different types of files separately:

Source https://stackoverflow.com/questions/70633469

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install crlf

You can download it from GitHub.

Support

For any new features, suggestions and bugs, create an issue on GitHub. If you have any questions, check and ask them on the Stack Overflow community page.
