module Shellwords

Manipulates strings like the UNIX Bourne shell

This module manipulates strings according to the word parsing rules of the UNIX Bourne shell.

The shellwords() function was originally a port of shellwords.pl, but has been modified to conform to the Shell & Utilities volume of IEEE Std 1003.1-2008, 2016 Edition [1].

Usage

You can use Shellwords to parse a string into a Bourne shell friendly Array.

require 'shellwords'

argv = Shellwords.split('three blind "mice"')
argv #=> ["three", "blind", "mice"]

Once you have required Shellwords, you can also call String#shellsplit, which is an alias for split.

argv = "see how they run".shellsplit
argv #=> ["see", "how", "they", "run"]

Both treat quotes as special characters, so an unmatched quote raises an ArgumentError.

argv = "they all ran after the farmer's wife".shellsplit
     #=> ArgumentError: Unmatched quote: ...
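When the input comes from a user or a configuration file, the unmatched-quote error can be rescued. The following is a minimal sketch; the helper name safe_shellsplit is hypothetical and not part of Shellwords.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Shellwords): returns nil instead of
# raising when the input contains an unmatched quote.
def safe_shellsplit(line)
  Shellwords.split(line)
rescue ArgumentError
  nil
end

ok  = safe_shellsplit('three blind "mice"')
bad = safe_shellsplit("the farmer's wife")
```

Here ok holds the parsed words, while bad is nil because the apostrophe opens a single quote that is never closed.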

Shellwords also provides methods that do the opposite. Shellwords.escape, or its alias String#shellescape, escapes shell metacharacters in a string for use in a command line.

filename = "special's.txt"

system("cat -- #{filename.shellescape}")
# runs "cat -- special\\'s.txt"

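Note the "--". Without it, cat(1) would treat the following argument as a command line option if it starts with "-". Shellwords.escape is guaranteed to convert a string into a form that a Bourne shell will parse back to the original string, but it is the programmer's responsibility to make sure that passing an arbitrary argument to a command does no harm.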
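The round-trip guarantee can be checked without spawning a shell. The sketch below assumes Shellwords.split is an adequate stand-in for the shell's own word parsing; the file name is made up for illustration.

```ruby
require 'shellwords'

original = "special's file (draft).txt"
escaped  = Shellwords.escape(original)

# Splitting the escaped form with Bourne shell rules yields the
# original string back as a single word.
round_trip = Shellwords.split(escaped)
```

If the escaping is correct, round_trip is a one-element array containing original.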

Shellwords also comes with a core extension for Array, Array#shelljoin.

dir = "Funny GIFs"
argv = %W[ls -lta -- #{dir}]
system(argv.shelljoin + " | less")
# runs "ls -lta -- Funny\\ GIFs | less"

You can use this method to build a complete command line out of an array of arguments.

Authors

Wakou Aoyama
Akinori MUSHA <[email protected]>

Contact

Akinori MUSHA <[email protected]> (current maintainer)

Resources

1: IEEE Std 1003.1-2008, 2016 Edition, the Shell & Utilities volume

Constants

VERSION

Public Class Methods

escape(str)

Alias for: shellescape

join(array)

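Alias for: shelljoin

shellescape(str)

Escapes a string so that it can be safely used in a Bourne shell command line. str can be a non-string object as long as it responds to to_s.

Note that the resulting string should be used unquoted; it is not intended for use inside double quotes or single quotes.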

argv = Shellwords.escape("It's better to give than to receive")
argv #=> "It\\'s\\ better\\ to\\ give\\ than\\ to\\ receive"

String#shellescape is a shorthand for this function.

argv = "It's better to give than to receive".shellescape
argv #=> "It\\'s\\ better\\ to\\ give\\ than\\ to\\ receive"

# Search files in lib for method definitions
pattern = "^[ \t]*def "
IO.popen("grep -Ern -e #{pattern.shellescape} lib") { |grep|
  grep.each_line { |line|
    file, lineno, matched_line = line.split(':', 3)
    # ...
  }
}
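The advice to use the escaped result unquoted can be illustrated with a small sketch. It assumes Shellwords.split approximates how a Bourne shell would read the command line; no real command is run.

```ruby
require 'shellwords'

escaped = Shellwords.escape("two words")

# Used unquoted, the backslash protects the space and is then removed.
unquoted = Shellwords.split("echo #{escaped}")

# Wrapped in single quotes, the backslash becomes literal data.
single_quoted = Shellwords.split("echo '#{escaped}'")
```

In the single-quoted case the argument received by the command contains a stray backslash, which is why the escaped string should not be wrapped in additional quotes.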

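It is the caller's responsibility to encode the string in the right encoding for the shell environment where it will be used.

Multibyte characters are treated as multibyte characters, not as bytes.

Returns an empty quoted String ('') if str has a length of zero.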
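The reason for returning empty quotes can be sketched as follows. The command name cmd is a placeholder, and Shellwords.split is assumed to stand in for the shell's word splitting.

```ruby
require 'shellwords'

quoted_empty = Shellwords.escape("")

# Interpolated without quoting, an empty argument simply disappears.
dropped = Shellwords.split("cmd #{''} x")

# The quoted form survives as an empty word.
kept = Shellwords.split("cmd #{quoted_empty} x")
```

dropped contains two words, while kept contains three, the middle one being the empty string.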

# File lib/shellwords.rb, line 150
def shellescape(str)
  str = str.to_s

  # An empty argument will be skipped, so return empty quotes.
  return "''".dup if str.empty?

  str = str.dup

  # Treat multibyte characters as is.  It is the caller's responsibility
  # to encode the string in the right encoding for the shell
  # environment.
  str.gsub!(/[^A-Za-z0-9_\-.,:+\/@\n]/, "\\\\\\&")

  # A LF cannot be escaped with a backslash because a backslash + LF
  # combo is regarded as a line continuation and simply ignored.
  str.gsub!(/\n/, "'\n'")

  return str
end
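The two substitutions in the source above can be observed directly. This is a small sketch with made-up input strings; it checks the multibyte case only by round trip, again assuming Shellwords.split as a stand-in for the shell.

```ruby
require 'shellwords'

# A line feed is wrapped in single quotes rather than backslash-escaped.
with_newline = Shellwords.escape("line1\nline2")

# Multibyte characters are handled per character, not per byte.
multibyte = Shellwords.escape("日本語 テキスト")

newline_round_trip   = Shellwords.split(with_newline)
multibyte_round_trip = Shellwords.split(multibyte)
```

Both round trips should return the original string as a single word.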

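Also aliased as: escape

shelljoin(array)

Builds a command line string from an argument list, array.

All elements are joined into a single string with fields separated by a space, where each element is escaped for the Bourne shell and stringified using to_s.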

ary = ["There's", "a", "time", "and", "place", "for", "everything"]
argv = Shellwords.join(ary)
argv #=> "There\\'s a time and place for everything"

Array#shelljoin is a shortcut for this function.

ary = ["Don't", "rock", "the", "boat"]
argv = ary.shelljoin
argv #=> "Don\\'t rock the boat"

You can also mix non-string objects into the elements, as allowed in Array#join.

output = `#{['ps', '-p', $$].shelljoin}`
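As a further illustration of mixed element types, the sketch below joins a String, an Integer, and a Pathname. The command and path are invented for the example, and nothing is executed.

```ruby
require 'shellwords'
require 'pathname'

# Each element is converted with to_s before being escaped.
argv    = ["head", "-n", 5, Pathname.new("My Docs/a.txt")]
command = argv.shelljoin
```

The space inside the path is backslash-escaped, while the Integer passes through as plain digits.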

# File lib/shellwords.rb, line 196
def shelljoin(array)
  array.map { |arg| shellescape(arg) }.join(' ')
end
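Because shelljoin escapes every element, splitting its result should give back the original argument list, including empty and multi-line arguments. A minimal sketch with illustrative values:

```ruby
require 'shellwords'

argv = ["printf", "%s\n", "", "a b", "it's"]

# Join into one command line string, then parse it back.
line       = Shellwords.join(argv)
round_trip = Shellwords.split(line)
```

round_trip is expected to equal argv element for element.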

Also aliased as: join

shellsplit(line)

Splits a string into an array of tokens in the same way the UNIX Bourne shell does.

argv = Shellwords.split('here are "two words"')
argv #=> ["here", "are", "two words"]

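Note, however, that this is not a command line parser. Shell metacharacters other than the single quote, the double quote, and the backslash are not treated as such.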

argv = Shellwords.split('ruby my_prog.rb | less')
argv #=> ["ruby", "my_prog.rb", "|", "less"]
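The same applies to variables, globs, redirections, and command separators. The following sketch uses an invented command line to show that they are returned as ordinary text:

```ruby
require 'shellwords'

# No expansion, globbing, or redirection handling takes place; words
# are only separated at whitespace.
tokens = Shellwords.split('echo $HOME *.rb > out.txt; ls')
```

Note that the semicolon stays attached to out.txt, because only whitespace separates words.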

String#shellsplit is a shortcut for this function.

argv = 'here are "two words"'.shellsplit
argv #=> ["here", "are", "two words"]

# File lib/shellwords.rb, line 90
def shellsplit(line)
  words = []
  field = String.new
  line.scan(/\G\s*(?>([^\s\\\'\"]+)|'([^\']*)'|"((?:[^\"\\]|\\.)*)"|(\\.?)|(\S))(\s|\z)?/m) do
    |word, sq, dq, esc, garbage, sep|
    raise ArgumentError, "Unmatched quote: #{line.inspect}" if garbage
    # 2.2.3 Double-Quotes:
    #
    #   The <backslash> shall retain its special meaning as an
    #   escape character only when followed by one of the following
    #   characters when considered special:
    #
    #   $ ` " \ <newline>
    field << (word || sq || (dq && dq.gsub(/\\([$`"\\\n])/, '\\1')) || esc.gsub(/\\(.)/, '\\1'))
    if sep
      words << field
      field = String.new
    end
  end
  words
end
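The quoting rules implemented by the source above can be exercised with a few inputs. This is a sketch with made-up strings; Ruby single-quoted and %q literals are used so that the backslashes reach split unchanged.

```ruby
require 'shellwords'

# Inside double quotes a backslash only escapes $ ` " \ and newline,
# so \$ becomes $ while \d keeps its backslash.
in_dq = Shellwords.split('"a\$b" "c\d"')

# Outside quotes a backslash escapes any following character.
bare = Shellwords.split('a\$b c\d')

# Adjacent quoted and unquoted pieces are concatenated into one word.
glued = Shellwords.split(%q('one '"two "three))
```

A word ends only when a piece is followed by whitespace or the end of the string, which is why the three pieces in the last call form a single token.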

Also aliased as: shellwords, split

shellwords(line)

Alias for: shellsplit

split(line)

Alias for: shellsplit

Private Instance Methods

shellescape(str), shelljoin(array), shellsplit(line), and shellwords(line) are also defined as private instance methods. Their documentation and source are identical to the public class methods of the same names.