How to Split Files on Linux Using the csplit Command

csplit is a Linux utility that splits a text file into multiple smaller files. It works line by line, so the input is expected to be a text file. By default, csplit names the output files “xx00”, “xx01”, “xx02”, and so on, but we can specify a different prefix if needed. It also accepts several options that customize the output.

Splitting Files on Linux Using csplit

The name csplit stands for “context split”. The command splits a file into sections at context lines, which are given either as line numbers or as patterns (regular expressions).

csplit Command Basic Syntax

To use the csplit command in Linux, follow the syntax below:

csplit [OPTION]... fileName PATTERN...

Here, “OPTION” represents optional flags that customize the behavior of the csplit command, “fileName” is the text file to be split, and “PATTERN” is a line number or a regular expression that determines where the file is split. More than one pattern can be given.
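
To make the three parts of the syntax concrete, here is a minimal sketch, assuming GNU csplit. The file notes.txt and its content are hypothetical, invented for illustration. The command passes one option (-s), one file, and two patterns: a regular expression and a repeat count.

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"

# Hypothetical input file (not from the original article)
printf '%s\n' 'Intro' 'Chapter 1' 'alpha' 'Chapter 2' 'beta' > notes.txt

# OPTION: -s   FILE: notes.txt   PATTERNS: /^Chapter/ and {*}
# /^Chapter/ splits before each line that starts with "Chapter";
# {*} repeats the previous pattern as many times as possible.
csplit -s notes.txt '/^Chapter/' '{*}'

ls xx*        # xx00 xx01 xx02
cat xx01      # "Chapter 1" followed by "alpha"
```

The patterns are quoted so that the shell does not try to expand the slashes, braces, or asterisk itself.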

csplit Command Options

The csplit command can accept different options to perform various functionalities. Some widely used options of the csplit command are listed below:

Option | Description
-b, --suffix-format=FORMAT | Uses the sprintf FORMAT for the file name suffix instead of the default “%02d”.
-f, --prefix=PREFIX | Uses PREFIX for the output file names instead of the default “xx”.
-k, --keep-files | Keeps the output files if an error occurs instead of removing them.
--suppress-matched | Omits the lines that match PATTERN from the output files.
-n, --digits=DIGITS | Uses the given number of digits in the output file names instead of the default 2.
-s, --quiet, --silent | Splits the file silently, without printing the sizes of the output files.
-z, --elide-empty-files | Removes empty output files.
--help | Shows the help page of the csplit command, covering usage, options, and patterns.
--version | Shows the installed version of the csplit command.
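
The -b and -z options are not covered by the numbered examples in this guide, so here is a short sketch of both, assuming GNU csplit. The file chapters.txt, the prefix “part”, and the suffix format are hypothetical choices made for illustration.

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"

# Hypothetical input file whose very first line matches the pattern
printf '%s\n' 'Chapter 1' 'alpha' 'Chapter 2' 'beta' > chapters.txt

# Because line 1 already matches, the piece before the first match is empty.
# -z drops that empty piece; -f and -b produce names like part000.txt.
csplit -s -z -f part -b '%03d.txt' chapters.txt '/^Chapter/' '{*}'

ls part*      # part000.txt part001.txt
```

Without -z, the same command would also leave behind a zero-length first file.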

csplit Command Man Page

For more detailed information, you can open the official manual page of the csplit command by running man csplit in the terminal.


csplit Installation on Linux

The csplit command is part of the GNU coreutils package, so it comes pre-installed on virtually all Linux distributions. If it is missing on your system for some reason, you can install coreutils using a package manager such as apt, yum, or dnf, depending on your Linux distribution.
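
The following sketch checks whether csplit is available and lists the usual install commands as comments. It assumes that your distribution ships csplit in the coreutils package, which is the case for the mainstream ones. The variable name csplit_status is only illustrative.

```shell
# Check whether csplit is already on the PATH
if command -v csplit >/dev/null 2>&1; then
    csplit_status="installed"
else
    csplit_status="missing"
fi
echo "csplit is $csplit_status"

# If it is missing, install coreutils with your package manager:
#   sudo apt install coreutils     # Debian, Ubuntu
#   sudo dnf install coreutils     # Fedora, recent RHEL
#   sudo yum install coreutils     # older RHEL and CentOS
```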

How Does csplit Work on Linux?

Let’s look at some examples to understand how the csplit command operates in Linux, with and without options. For this purpose, first create a text file named sampleFile.txt with a few lines of content.
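
The original content of sampleFile.txt is not reproduced here, so the sketch below creates a hypothetical stand-in with six numbered lines. Any short multi-line text file works just as well.

```shell
# Hypothetical stand-in content: six numbered lines
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt

cat sampleFile.txt
```

The numbered lines make it easy to see where each split falls.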


In the following examples, we’ll use the csplit command to split this file into multiple sections.

Example 1: Splitting a File Using csplit Command

Let’s run the csplit command without any options to see how it splits a file by default. The only arguments are the file name and the pattern, here the line number 3 (csplit sampleFile.txt 3).

This command splits sampleFile.txt at the third line and prints the size of each output file in bytes.

You can view the content of the split files using the cat command (for example, cat xx00 and cat xx01).
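
Put together, Example 1 can be sketched as follows. The six-line sample content is a hypothetical stand-in, so the byte counts apply to that content only.

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# Split at line 3; csplit prints the byte count of each piece (14 and 28 here)
csplit sampleFile.txt 3

cat xx00   # Line 1 and Line 2
cat xx01   # Line 3 through Line 6
```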

The output confirms that the file has been split into two files. The first file, “xx00”, holds the lines before the third line, and the second file, “xx01”, starts at the third line of sampleFile.txt, as specified in the csplit command.


Example 2: Splitting a File With Specific Prefixes

We can use the “-f” flag with the csplit command to give the output files a prefix of our choice. For instance, a command such as csplit -f file sampleFile.txt 3 uses “file” as the prefix instead of the default “xx”.
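
A runnable sketch of this example, again using the hypothetical six-line sample content:

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# -f file replaces the default "xx" prefix
csplit -f file sampleFile.txt 3

ls   # file00  file01  sampleFile.txt
```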

This time, the split files are prefixed with “file” instead of “xx” (file00 and file01).


Example 3: Keeping the Output Files When Some Errors Occur

By default, csplit removes the output files it has created when an error occurs. With the -k option, it keeps them. For instance, suppose we run a command that fails because one of its patterns never matches.
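
The original erroneous command is not reproduced here, so the sketch below uses an assumed one: a second pattern, /NoSuchText/, that matches nothing in the hypothetical sample file. It runs the command once without -k and once with it, assuming GNU csplit.

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# Without -k: the second pattern never matches, csplit fails and cleans up
csplit sampleFile.txt 3 '/NoSuchText/' || echo "csplit failed with status $?"
ls xx* 2>/dev/null || echo "no output files left"

# With -k: the same failure, but the pieces written so far are kept
csplit -k sampleFile.txt 3 '/NoSuchText/' || echo "csplit failed with status $?"
ls xx*
```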

The output shows that an error occurs while executing the csplit command. However, csplit keeps the files it has already written because of the -k option.


Example 4: Modifying the Number of Digits in the Split Filenames

The -n option with the csplit command lets us specify the number of digits to be used in the output (split) filenames. By default, csplit uses two digits (i.e., xx00, xx01), but we can modify this to use more or fewer digits as needed. For instance, specifying -n 3, as in csplit -n 3 sampleFile.txt 3, produces filenames with three digits, such as xx000 and xx001.
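
A runnable sketch of this example, using the hypothetical six-line sample content:

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# -n 3 widens the numeric suffix from two digits to three
csplit -n 3 sampleFile.txt 3

ls   # sampleFile.txt  xx000  xx001
```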


Example 5: Split Files Silently Without Showing the Sizes

In the previous examples, you may have noticed that the csplit command prints the size of each output file in bytes. If we don’t want to see these sizes, we can use the -s option with the csplit command (for example, csplit -s sampleFile.txt 3).
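
A runnable sketch of this example, using the hypothetical six-line sample content:

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# -s suppresses the byte counts; the files are still created
csplit -s sampleFile.txt 3

ls xx*   # xx00  xx01
```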

This time, the csplit command executes silently and prints nothing.


Example 6: Omitting a Specific Line From the Output Files

If we need to drop the line at which the file is split, we can use the --suppress-matched option with the csplit command. For instance, a command such as csplit --suppress-matched sampleFile.txt 3 omits line 3 from the split output files.
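
A runnable sketch of this example, assuming a GNU coreutils release recent enough to include --suppress-matched and using the hypothetical six-line sample content:

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# Split at line 3 and leave that line out of both pieces
csplit --suppress-matched sampleFile.txt 3

cat xx00   # Line 1 and Line 2
cat xx01   # Line 4 through Line 6
```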

The output confirms that the third line appears in neither of the output files.


Example 7: Checking the csplit Version

To check which version of the csplit command you are using, run it with the --version option (csplit --version).
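
A minimal sketch, assuming GNU csplit (other implementations may not support --version). Only the first line of the output is shown, since the rest is licensing text.

```shell
# Print the first line of the version banner
csplit --version | head -n 1
```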


Example 8: Accessing the csplit Command Help Page

To learn more about the available options and how they work, run the csplit command with the “--help” option (csplit --help).
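
A minimal sketch, assuming GNU csplit. The second command is an optional convenience that narrows the help text to a single option.

```shell
# Show the full help page
csplit --help

# Show only the line that describes the prefix option
csplit --help | grep -- '--prefix'
```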


Difference Between split and csplit Command

In Linux, both the split and csplit commands split a file into smaller files, but they decide where to cut in different ways. The split command divides a file by byte size or line count, which is useful for producing fixed-size chunks but ignores the content. In contrast, the csplit command cuts at content patterns or specific line numbers, which is useful when a file must be divided according to its content rather than its size.
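
The contrast can be sketched on the same input. The six-line sample content and the pattern /^Line 5$/ are hypothetical choices made for illustration.

```shell
# Work in a scratch directory so existing files are not overwritten
cd "$(mktemp -d)"
printf 'Line %d\n' 1 2 3 4 5 6 > sampleFile.txt   # hypothetical sample content

# split: fixed-size chunks of 2 lines each, regardless of content (xaa, xab, xac)
split -l 2 sampleFile.txt

# csplit: cut where the content says so, here before the line "Line 5"
csplit -s sampleFile.txt '/^Line 5$/'

ls
```

Here split produces three equal pieces, while csplit produces two pieces of unequal size (xx00 and xx01) because the cut follows the matching line.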

Conclusion

csplit is a commonly used command in Linux that splits text files into smaller sections based on specific patterns or line numbers. It offers various options to customize its behavior: we can change the output file prefix, modify the number of digits in the filenames, suppress the size output, and more. This tutorial covered the basic syntax of the csplit command, its options, and practical examples of its use.
