netwjx

Chaos and Order

Notes on Ruby and Jekyll

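A while ago I noticed that in the pages Octopress generates, line breaks in the content attribute of meta tags are left unhandled. Today I tried writing a plugin to fix this. I have never learned Ruby, so almost everything below was looked up on the spot, and I am writing down the key points.

The plugin code is as follows: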

plugins/html_attr_filter.rb
# coding: utf-8

#html attribute filter
module HtmlAttrFilters
    def html_attr(input)
        input.gsub(/\r\n|\r|\n/, "\r\n"=>'&#13;&#10;', "\r"=>'&#13;', "\n"=>'&#10;')
    end
end

Liquid::Template.register_filter HtmlAttrFilters

Then edit the line containing <meta name="description" in source/_includes/head.html:

source/_includes/head.html
  <meta name="description" content="{{ description | strip_html | condense_spaces | truncate:150 | html_attr }}">

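After running rake generate, the content attribute of <meta name="description"> no longer contains line breaks. The rest of this post covers the pieces involved along the way.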
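To check the filter's behaviour without a full rake generate, the module can be exercised from plain Ruby. This is only a sketch: it copies the module without the Liquid registration line, and it assumes the replacement values are the numeric character references for CR and LF. The helper and description names are made up for the demonstration.

```ruby
# Stand-alone copy of the filter module, without the Liquid registration,
# so that it runs with plain Ruby.
module HtmlAttrFilters
  def html_attr(input)
    # Assumption: each line break is replaced by its numeric character reference.
    input.gsub(/\r\n|\r|\n/, "\r\n" => '&#13;&#10;', "\r" => '&#13;', "\n" => '&#10;')
  end
end

# Mix the module into a throwaway object to call the filter directly.
helper = Object.new.extend(HtmlAttrFilters)
description = "first line\r\nsecond line\nthird line"
escaped = helper.html_attr(description)

puts escaped
# prints: first line&#13;&#10;second line&#10;third line
```

Because \r\n comes first in the alternation, a Windows line ending is matched as one unit and looked up under the "\r\n" key rather than being replaced twice.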

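Jekyll extensions and Liquid extensions

Octopress is built on Jekyll, and Jekyll's template engine is Liquid. In a template expression such as {{ a | foo | bar }}, foo and bar are called filters. The Jekyll plugin development documentation has a section on adding custom filters, and that is what I mainly followed to write the plugin at the top of this post.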

Liquid filters

You can add your own filters to the liquid system much like you can add tags above. Filters are simply modules that export their methods to liquid. All methods will have to take at least one parameter which represents the input of the filter. The return value will be the output of the filter.

module Jekyll
  module AssetFilter
    def asset_url(input)
      "http://www.example.com/#{input}?#{Time.now.to_i}"
    end
  end
end

Liquid::Template.register_filter(Jekyll::AssetFilter)

Advanced: you can access the site object through the @context.registers feature of liquid. Registers is a hash to which arbitrary context objects can be attached. In Jekyll you can access the site object through registers. As an example, you can access the global configuration (_config.yml) like this: @context.registers[:site].config['cdn'].
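A filter that reads the configuration this way might look like the sketch below. The CdnFilter module and its cdn_url method are hypothetical, 'cdn' is assumed to be a key you define yourself in _config.yml, and the Fake* classes are stand-ins so the example runs without Jekyll or Liquid installed. In a real plugin the module would be registered with Liquid::Template.register_filter, as in the example above.

```ruby
# Hypothetical filter: prefixes a path with the 'cdn' value from _config.yml.
module CdnFilter
  def cdn_url(input)
    site = @context.registers[:site]
    base = site.config['cdn'].to_s.chomp('/')      # drop a trailing slash
    "#{base}/#{input.to_s.sub(%r{\A/}, '')}"       # drop a leading slash
  end
end

# Stand-ins for the objects Liquid and Jekyll would normally provide.
FakeSite = Struct.new(:config)
FakeContext = Struct.new(:registers)

class FakeStrainer
  include CdnFilter

  def initialize(context)
    @context = context
  end
end

site = FakeSite.new({ 'cdn' => 'http://cdn.example.com/' })
strainer = FakeStrainer.new(FakeContext.new({ site: site }))
cdn_result = strainer.cdn_url('/images/logo.png')

puts cdn_result
# prints: http://cdn.example.com/images/logo.png
```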

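Further reading: all the filters available in Octopress, Jekyll, and Liquid

The filters added by Octopress are defined in plugins/octopress_filters.rb, mainly the part of that file starting at line 36: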

plugins/octopress_filters.rb
module OctopressLiquidFilters
  include Octopress::Date

  # Used on the blog index to split posts on the <!--more--> marker
  def excerpt(input)
    if input.index(/<!--\s*more\s*-->/i)
      input.split(/<!--\s*more\s*-->/i)[0]
    else
      input
    end
  end

  # Checks for excerpts (helpful for template conditionals)
  def has_excerpt(input)
    input =~ /<!--\s*more\s*-->/i ? true : false
  end

  # Summary is used on the Archive pages to return the first block of content from a post.
  def summary(input)
    if input.index(/\n\n/)
      input.split(/\n\n/)[0]
    else
      input
    end
  end

  # Extracts raw content DIV from template, used for page description as 
  # contains complete sub-template code on main page level
  def raw_content(input)
    /<div class="entry-content">(?<content>[\s\S]*?)<\/div>\s*<(footer|\/article)>/ =~ input
    return (content.nil?) ? input : content
  end

  # Escapes CDATA sections in post content
  def cdata_escape(input)
    input.gsub(/<!\[CDATA\[/, '&lt;![CDATA[').gsub(/\]\]>/, ']]&gt;')
  end

  # Replaces relative urls with full urls
  def expand_urls(input, url='')
    url ||= '/'
    input.gsub /(\s+(href|src)\s*=\s*["|']{1})(\/[^\"'>]*)/ do
      $1+url+$3
    end
  end

  # Improved version of Liquid's truncate:
  # - Doesn't cut in the middle of a word.
  # - Uses typographically correct ellipsis (…) instead of '...'
  def truncate(input, length)
    if input.length > length && input[0..(length-1)] =~ /(.+)\b.+$/im
      $1.strip + ' &hellip;'
    else
      input
    end
  end

  # Improved version of Liquid's truncatewords:
  # - Uses typographically correct ellipsis (…) instead of '...'
  def truncatewords(input, length)
    truncate = input.split(' ')
    if truncate.length > length
      truncate[0..length-1].join(' ').strip + ' &hellip;'
    else
      input
    end
  end

  # Condenses multiple spaces and tabs into a single space
  def condense_spaces(input)
    input.gsub(/\s{2,}/, ' ')
  end

  # Removes trailing forward slash from a string for easily appending url segments
  def strip_slash(input)
    if input =~ /(.+)\/$|^\/$/
      input = $1
    end
    input
  end

  # Returns a url without the protocol (http://)
  def shorthand_url(input)
    input.gsub /(https?:\/\/)(\S+)/ do
      $2
    end
  end

  # Returns a title cased string based on John Gruber's title case http://daringfireball.net/2008/08/title_case_update
  def titlecase(input)
    input.titlecase
  end

end
Liquid::Template.register_filter OctopressLiquidFilters

The name after each def is the name of the filter.
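One way to picture this is that a chain such as {{ text | condense_spaces | truncatewords: 3 }} feeds the value through the methods with those names, left to right, with any filter arguments passed after the input. The sketch below imitates that in plain Ruby. It is not how Liquid is implemented internally: DemoFilters and apply_chain are hypothetical names, and the two method bodies are copied from the listing above.

```ruby
module DemoFilters
  # Same body as condense_spaces in the Octopress listing.
  def condense_spaces(input)
    input.gsub(/\s{2,}/, ' ')
  end

  # Same logic as truncatewords in the Octopress listing.
  def truncatewords(input, length)
    words = input.split(' ')
    if words.length > length
      words[0..length - 1].join(' ').strip + ' &hellip;'
    else
      input
    end
  end
end

# Hypothetical helper: each step is [filter_name, *extra_arguments].
def apply_chain(input, filters, chain)
  host = Object.new.extend(filters)
  chain.reduce(input) do |value, (name, *args)|
    host.public_send(name, value, *args)
  end
end

text = "one   two \t three four"
result = apply_chain(text, DemoFilters, [[:condense_spaces], [:truncatewords, 3]])

puts result
# prints: one two three &hellip;
```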

The filters added by Jekyll are listed here.

Liquid's standard filters are listed here.

Ruby strings and regular expressions

Ruby strings can be written with double quotes, "foo bar", or with single quotes, 'foo bar'. The differences are:

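  • Inside double quotes you can use escape sequences such as \r\n, and #{bar} to interpolate the value of a variable, where bar is the variable name.
  • Single quotes keep almost every character literally (only \\ and \' are treated as escapes), so '\r\n' is four characters, equivalent to "\\r\\n" in double quotes.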
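The difference is easy to confirm in irb. A small sketch, with variable names chosen only for the demonstration:

```ruby
bar = "world"

double = "foo #{bar}\r\n"   # interpolation and escapes are processed
single = 'foo #{bar}\r\n'   # everything is kept literally

puts double.inspect   # prints: "foo world\r\n"
puts double.length    # prints: 11
puts single.inspect   # prints: "foo \#{bar}\\r\\n"
puts single.length    # prints: 14

# The single-quoted form equals a double-quoted string with the
# backslashes and the # escaped by hand.
same = (single == "foo \#{bar}\\r\\n")
puts same             # prints: true
```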

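String replacement in Ruby is done with the gsub method, similar to replace in other languages; the first argument can be a regular expression. The Ruby documentation gives these examples: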

String#gsub
"hello".gsub(/[aeiou]/, '*')                  #=> "h*ll*"
"hello".gsub(/([aeiou])/, '<\1>')             #=> "h<e>ll<o>"
"hello".gsub(/./) {|s| s.ord.to_s + ' '}      #=> "104 101 108 108 111 "
"hello".gsub(/(?<foo>[aeiou])/, '{\k<foo>}')  #=> "h{e}ll{o}"
'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*')    #=> "h3ll*"

I have not read the Ruby language specification in full, but according to the documentation, the 'e' => 3, 'o' => '*' in the last line of the example is a Hash.
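Ruby lets the braces be omitted when a Hash is the last argument of a method call, which is why the example reads like a list of extra arguments. A short sketch of the equivalence, with one extra case showing that matched text with no key in the Hash is replaced by an empty string:

```ruby
# Braces omitted: the trailing key => value pairs are collected into one Hash.
implicit = 'hello'.gsub(/[eo]/, 'e' => 3, 'o' => '*')

# The same call with the Hash written out explicitly.
explicit = 'hello'.gsub(/[eo]/, { 'e' => 3, 'o' => '*' })

puts implicit   # prints: h3ll*
puts explicit   # prints: h3ll*

# 'l' is matched but is not a key, so it is replaced by "".
missing = 'hello'.gsub(/[el]/, 'e' => 3)
puts missing    # prints: h3o
```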

There is also a sub method. The difference from gsub is that sub replaces only the first match, while gsub replaces every match.
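A minimal comparison of the two, using a made-up sample string:

```ruby
text = "a-b-c"

first_only = text.sub('-', '+')    # replaces the first match only
all_matches = text.gsub('-', '+')  # replaces every match

puts first_only    # prints: a+b-c
puts all_matches   # prints: a+b+c
puts text          # prints: a-b-c (neither method modifies the receiver)
```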

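More resources:

Using regular expressions in Ruby feels very similar to JavaScript, although Ruby has some additional syntax of its own. For details, see the Regexp entry in the Ruby core library reference.

One thing that stood out is that JavaScript regular expressions accept the options igm, while Ruby's are imxo; see the Options section here.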
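The letters overlap but do not always mean the same thing. The sketch below shows the Ruby side as I understand it: m makes . match a newline (unlike JavaScript's m), ^ already matches at the start of every line without any option, x allows whitespace and comments inside the pattern, and there is no g because methods such as gsub and scan are global already. The sample text and variable names are made up.

```ruby
text = "Foo\nbar"

# i: case-insensitive, same meaning as in JavaScript.
case_insensitive = text.match?(/foo/i)

# m: in Ruby this lets . match a newline.
dot_without_m = text.match?(/Foo.bar/)
dot_with_m = text.match?(/Foo.bar/m)

# ^ matches at the start of each line even with no option given.
line_anchor = text.match?(/^bar/)

# x: whitespace and comments inside the pattern are ignored.
extended = text.match?(/
  foo   # first word
  \s    # any whitespace, including the newline
  bar   # second word
/xi)

# No g option: scan (like gsub) already finds every match.
o_count = text.scan(/o/).length

puts case_insensitive   # prints: true
puts dot_without_m      # prints: false
puts dot_with_m         # prints: true
puts line_anchor        # prints: true
puts extended           # prints: true
puts o_count            # prints: 2
```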
