From: Greg Chicares
Subject: [lmi-commits] [lmi] master 7d88df5 2/3: Do not let files end in backslash-newline
Date: Tue, 19 May 2020 11:41:47 -0400 (EDT)
branch: master
commit 7d88df5a637e856185cbb8fd01f617853beb17fe
Author: Gregory W. Chicares <address@hidden>
Commit: Gregory W. Chicares <address@hidden>
Do not let files end in backslash-newline
The 'read' documentation for POSIX (IEEE 1003.1-2017) says:
| Although the standard input is required to be a text file, and
| therefore will always end with a <newline> (unless it is an empty
| file), the processing of continuation lines when the -r option is
| not used can result in the input not ending with a <newline>.
| This occurs if the last line of the input file ends with a
| <backslash> <newline>.
It seems best to follow the C rules and disallow <backslash> <newline>
at the end of any file.
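
As a rough illustration of the rule described above, the check can be sketched as a standalone function. This is only a sketch under stated assumptions: the name ending_error() and its convention of returning an empty string on success are hypothetical and are not part of lmi, whose actual implementation is the file::file() change in the diff below.

```cpp
#include <string>

// Hypothetical helper (not part of lmi): return an error message if
// 'contents' ends improperly, or an empty string if it is acceptable.
std::string ending_error(std::string const& contents)
{
    // Prepend a newline sentinel, so that a 0-byte file is treated
    // as ending in a newline and indexing the last byte is safe.
    std::string const s = '\n' + contents;
    std::string::size_type const n = s.size(); // always at least 1

    if('\n' != s[n - 1])
        {
        return "File does not end in newline.";
        }
    // The byte before the final newline must not be a backslash.
    if(2 <= n && '\\' == s[n - 2])
        {
        return "File ends in backslash-newline.";
        }
    return "";
}
```

Under these assumptions, an empty file and a file containing "z\n" are accepted, while a file containing a lone space or a backslash followed by a newline is rejected, which mirrors the touchstone files created in the test script.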
---
objects.make | 2 ++
test_coding_rules.cpp | 17 ++++++++++++-----
test_coding_rules_test.sh | 11 +++++++++++
3 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/objects.make b/objects.make
index 1f58504..0b7e613 100644
--- a/objects.make
+++ b/objects.make
@@ -1208,3 +1208,5 @@ product_files$(EXEEXT): \
my_rnd.o \
my_tier.o \
liblmi$(SHREXT) \
+
+# This file does not end in backslash-newline.
diff --git a/test_coding_rules.cpp b/test_coding_rules.cpp
index e851611..a300e8e 100644
--- a/test_coding_rules.cpp
+++ b/test_coding_rules.cpp
@@ -26,6 +26,7 @@
#include "istream_to_string.hpp"
#include "main_common.hpp"
#include "miscellany.hpp" // begins_with(), split_into_lines()
+#include "ssize_lmi.hpp"
#include <boost/filesystem/convenience.hpp> // fs::extension()
#include <boost/filesystem/fstream.hpp>
@@ -117,8 +118,9 @@ class file final
/// an exception for empty files, but there's no reason for lmi to
/// have any.
///
-/// Add a '\n' sentry at the beginning of the string for the reason
-/// explained in 'regex_test.cpp'.
+/// Add a newline at the beginning of the string, and require
+/// a newline at the end, so that "\n" can be used in regexen
+/// instead of '^' and '$' anchors--see 'regex_test.cpp'.
///
/// Files
/// ChangeLog-2004-and-prior *.txt *.xpm
@@ -223,10 +225,15 @@ file::file(std::string const& file_path)
}
data_ = '\n' + data();
- // The '\n' sentinel just added makes back() safe for 0-byte files:
- if('\n' != data().back())
+
+ int const datasize = lmi::ssize(data());
+ if(1 <= datasize && '\n' != data_[datasize - 1])
+ {
+ throw std::runtime_error(R"(File does not end in newline.)");
+ }
+ if(2 <= datasize && '\\' == data_[datasize - 2])
{
- throw std::runtime_error(R"(File does not end in '\n'.)");
+ throw std::runtime_error(R"(File ends in backslash-newline.)");
}
}
diff --git a/test_coding_rules_test.sh b/test_coding_rules_test.sh
index 79c50f5..8b7c8b4 100755
--- a/test_coding_rules_test.sh
+++ b/test_coding_rules_test.sh
@@ -48,6 +48,13 @@ cat >eraseme_000 <<EOF
$boilerplate
EOF
+touch eraseme_0_bytes.touchstone
+
+printf '\n' >eraseme_1_byte_good.touchstone
+printf ' ' >eraseme_1_byte_bad.touchstone
+printf 'z\n' >eraseme_2_bytes_good.touchstone
+printf '\\\n' >eraseme_2_bytes_bad.touchstone
+
# Files in general: copyright.
cat >eraseme_copyright_000 <<EOF
@@ -389,6 +396,8 @@ Exception--file 'a_nonexistent_file': File not found.
File 'an_expungible_file.bak' ignored as being expungible.
Exception--file 'an_unexpected_file': File is unexpectedly uncategorizable.
Exception--file 'another.unexpected.file': File is unexpectedly uncategorizable.
+Exception--file 'eraseme_1_byte_bad.touchstone': File does not end in newline.
+Exception--file 'eraseme_2_bytes_bad.touchstone': File ends in backslash-newline.
File 'eraseme_copyright_001' lacks current copyright.
File 'eraseme_copyright_001' breaks taboo '\(c\) *[0-9]'.
File 'eraseme_copyright_003.html' lacks current copyright.
@@ -451,3 +460,5 @@ diff --unified=0 expected_eraseme observed_eraseme && rm --force \
another.unexpected.file \
eraseme* \
./*eraseme \
+
+# This file does not end in backslash-newline.