Re: view-fuzzy script

trans-coord-devel

From:	Ineiev
Subject:	Re: view-fuzzy script
Date:	Wed, 16 Nov 2011 15:49:48 +0000
User-agent:	Thunderbird 2.0.0.14 (X11/20080501)

On 10/20/2011 05:24 AM, Yavor Doganov wrote:

>> (1) use my sed-based script;
>
> If it works, why not.  We don't need so many options, BTW.


I attach a prototype. It processes all PO files in a working copy
of www within ten minutes, and the output seems to contain nothing
criminal, so I think it is fast and reliable enough.
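
A run over a whole working copy could look roughly like the sketch below.
The function name annotate_tree, the ".annotated" suffix and the file names
in the usage line are made up for illustration; the only thing taken from
this message is that the attached script is a `sed -nf' script that reads a
PO file and prints the annotated version.

```shell
# Sketch only: annotate_tree and the ".annotated" suffix are hypothetical
# names, not part of GNUN.
annotate_tree () {
  script=$1   # path to a sed script such as the attached one
  dir=$2      # root of the working copy to scan

  # Note: the attached script creates its temporary files in the
  # current directory (it calls mktemp with a relative template).
  find "$dir" -type f -name '*.po' | while IFS= read -r po; do
    # Write the result next to the original instead of overwriting it.
    sed -nf "$script" "$po" > "$po.annotated"
  done
}
```

With the attachment saved as view-fuzzy.sed (an assumed name), the call
would be `annotate_tree ./view-fuzzy.sed ./www'.  The attached script
itself needs GNU sed and wdiff.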

>> (2) write a patch (or submit a feature request) for msgmerge
>>     to provide an option to include a wdiff to previous translation.
>
> This is the best option, but gettext is in C, so it's probably quite
> some work.  Worth investigating, at least.


After some discussion on bug-gettext@, it became clear that, on the one
hand, it would need more code than just adding the diffs (gettext would
need to learn how to read those diffs, undiff them (which would require
a diff tool other than wdiff), and perhaps something else); on the other
hand, this would be an incompatible change in the PO format, so we
wouldn't be able to use it for a long time, until virtually all PO
editors are aware of the feature.

#! /bin/sed -nf

# Copyright (C) 2011 Free Software Foundation, Inc.

# This file is part of GNUnited Nations.

# GNUnited Nations is free software: you can redistribute it and/or
# modify it under the terms of the GNU General Public License as
# published by the Free Software Foundation, either version 3 of the
# License, or (at your option) any later version.

# GNUnited Nations is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with GNUnited Nations.  If not, see <http://www.gnu.org/licenses/>.

# Prepend fuzzy translations with wdiff from the previous msgid
# to current msgid.  The difference is put in comment lines beginning
# with `# | '; if such comments are present in the PO file, they are
# removed by the script.

# Remove old difference.
/^# |[| ]/d;
# Add new difference.
/^#, fuzzy\>/{
# Read previous and current msgids.
  N; /\n#| msgid "/! b obsolete;
  :read-previous;
  N; /\n#| [^\n]*$/ b read-previous;
  /\nmsgid "[^\n]*$/! b obsolete;
  :read-current;
  N; /\nmsgstr "/ b create-file;
  /\n"[^\n]*$/b read-current; b obsolete;
# Create a temporary file.  Its name shall be kept in the last line
# of both pattern and hold space.
  :create-file;
# It seems to be considerably faster to create a new file than
# to rewrite an existing one on fencepost (?)
  x; s%.*%mktemp addiff.XXXXXXXXXXXXX%e;
  H; g;
# Cut out the lines with current msgid.
  s/\nmsgid ".*\(\n[^\n]*\)$/\1/;
# Extract previous msgid.
  s/^#, fuzzy\>[^\n]*\n//; s/^#| msgid "\([^\n]*\)"[^\n]*\n/\1\n/;
  s/\nmsgid [^\n]*\n/\n/;
  s/\(" *\)\?\n#| "//g;
  s/" *\n/\n/; s/^\n*//; 
# Escape single quotes to work around the command line.
  s/'/'"'"'/g;
# Escape `\n's to work around wdiff.
  s/\\n/\\\\n/g;
# Output previous msgid to temporary file.
  s%^\(.*\)\n%echo '\1' > %e;
# Extract current msgid.
  g; s/\nmsgstr ".*\n/\n/; s/^.*\nmsgid "\([^\n]*\)"/\1/;
  s/\(" *\)\?\n"//g; s/" *\n/\n/; s/^\n*//; s/'/'"'"'/g; s/\\n/\\\\n/g;
# Invoke diff program and remove the temporary file.
  s%\(.*\)\n\([^\n]*\)$%echo '\1' | wdiff \2 -; rm \2%e;
  s/\n//g; t rm-duplicates;
# A little bit of postprocessing.
  :rm-duplicates; s/\[-\(.*\)-\] *{+\1+}/\1/; t rm-duplicates;
  :merge-tails;
  s/\[-\(.\)\?\(.\+\)-\] *{+\(.\)\?\2+}/\[-\1-\]{+\3+}\2/;
  t merge-tails;
  :merge-heads;
  s/\[-\(.\+\)\(.\)\?-\] *{+\1\(.\)\?+}/\1\[-\2-\]{+\3+}/;
  t merge-heads;
  :merge-wings;
  s/\[-\(.\+\)\(.\)\?\(.\+\)-\] *{+\1\(.\)\?\3+}/\1\[-\2-\]{+\4+}\3/;
  t merge-wings;
  s/\[--\]//g; s/{++}//g;
  /{+.*+}/ b format; /\[-.*-\]/ b format;
  s/.*/# || No change detected.  The change might only be in amounts of spaces./
  b output-diff;
  :format;
# fmt -w 75 the result.
# Remove possible long word beginning the string
  s/^\([^ ]\{75,\}\) \(.*\)$/\1\n\2/;
  :fmt;
# Long word; leave on the line.
  s/\(\n[^\n ]\{75,\}\) \([^\n]*\)$/\1\n\2/; t fmt;
# There is a space: split the line.
  s/\([^\n]\{75\}\)\([^\n]\+\)$/\1\n\2/; t proceed;
# No more characters.
  b fmt-done;
  :proceed;
# Move the newline to the last space in the line.
  s/ \([^\n ]*\)\n\([^\n]*\)$/\n\1\2/; b fmt;
  :fmt-done;
# Prepend every line with `# | '.
  s/\(^\|\n\)/\1# | /g;
  :output-diff; p;
# Pass original content to output.
  g; s/\n\([^\n]*\)$//
  :obsolete
# We've got something different, e.g. obsolete msg or no previous msgid;
# just pass what we have.
}
# Otherwise, just print the input.
p
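
Two of the script's idioms are easier to follow in isolation.  The sketch
below reproduces the single-quote escaping and the rm-duplicates loop on
made-up sample strings; the second sample imitates wdiff markup by hand, so
wdiff itself is not needed.  The variable names and the label `again' are
illustrative and do not appear in the script.

```shell
# 1. Quoting.  The script turns every ' into '"'"' so that the text can
#    sit inside single quotes on a generated command line.
text="it's a test"                       # made-up sample
escaped=$(printf '%s\n' "$text" | sed "s/'/'\"'\"'/g")
# Feeding the escaped text back through the shell restores the original.
roundtrip=$(eval "printf '%s\n' '$escaped'")
printf '%s\n' "$roundtrip"               # prints: it's a test

# 2. rm-duplicates.  A deletion followed by an identical insertion is not
#    a real change, so the markup around it is dropped; the loop repeats
#    until no such pair is left.
sample='The [-quick-] {+quick+} brown [-fox-] {+cat+}'
cleaned=$(printf '%s\n' "$sample" | sed -e ':again' \
  -e 's/\[-\(.*\)-\] *{+\1+}/\1/' -e 't again')
printf '%s\n' "$cleaned"                 # prints: The quick brown [-fox-] {+cat+}
```

Only the identical pair around "quick" is collapsed; the real change from
"fox" to "cat" keeps its markup and is handled by the later merge steps.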
