Class CodeRay::Scanners::Scanner
In: lib/coderay/scanner.rb
Parent: StringScanner

Scanner

The base class for all Scanners.

It is a subclass of Ruby's great StringScanner, which makes it easy to access the scanning methods inside.

It is also Enumerable, so you can use it like an Array of Tokens:

  require 'coderay'

  c_scanner = CodeRay::Scanners[:c].new "if (*p == '{') nest++;"

  for text, kind in c_scanner
    puts text if kind == :operator
  end

  # prints: (*==)++;

OK, this is a very simple example :) You can also use map, any?, find and even sort_by, if you want.
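
As a rough illustration of those Enumerable helpers, the sketch below uses a stand-in class instead of a real CodeRay scanner, so that it runs with plain Ruby. PairList and its hard-coded token list are hypothetical; the only assumption carried over from the example above is that a scanner yields [text, kind] pairs from each.

```ruby
# Hypothetical stand-in for a scanner: it yields [text, kind] pairs from
# a fixed list instead of scanning real source code.
class PairList
  include Enumerable

  def initialize(pairs)
    @pairs = pairs
  end

  # Enumerable only needs #each; map, any?, find and sort_by build on it.
  def each(&block)
    @pairs.each(&block)
  end
end

tokens = PairList.new([
  ['if',   :reserved],
  [' ',    :space],
  ['(',    :operator],
  ['nest', :ident],
  ['++',   :operator],
])

texts     = tokens.map { |text, _kind| text }
has_ident = tokens.any? { |_text, kind| kind == :ident }
first_op  = tokens.find { |_text, kind| kind == :operator }
by_length = tokens.sort_by { |text, _kind| text.size }.map { |text, _kind| text }
```

Because only each is required, any scanner that yields pairs this way should support the same calls.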

Methods

Included Modules

Enumerable

Constants

ScanError = Class.new(Exception)
  Raised if a Scanner fails while scanning.

DEFAULT_OPTIONS = { :stream => false }
  The default options for all scanner classes. Define @default_options for subclasses.

KINDS_NOT_LOC = [:comment, :doctype]

External Aliases

string -> code
  More mnemonic accessor name for the input string.

Public Class methods

file_extension(extension = nil)

    # File lib/coderay/scanner.rb, line 86
    def file_extension extension = nil
      if extension
        @file_extension = extension.to_s
      else
        @file_extension ||= plugin_id.to_s
      end
    end

new(code='', options = {}, &block)

Creates a new Scanner.

  • code is the input String and is handled by the superclass StringScanner.
  • options is a Hash with Symbols as keys. It is merged with the default options of the class, so you can override default options here.
  • block is the callback for streamed highlighting.

If you set :stream to true in the options, the Scanner uses a TokenStream with the block as callback to handle the tokens. The scanner class must be Streamable in that case; otherwise a NotStreamableError is raised.

Otherwise, a Tokens object is used.
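
The option handling described above comes down to Hash#merge. The following is a minimal sketch with plain hashes, not a call into CodeRay; the variable names and the caller-supplied values are hypothetical.

```ruby
# Mirrors the DEFAULT_OPTIONS constant listed above.
defaults = { :stream => false }

# Hypothetical options a caller might pass to the constructor.
user_options = { :stream => true, :keep_tokens => true }

# Hash#merge returns a new Hash; on duplicate keys the argument wins,
# so caller-supplied values replace the defaults.
merged = defaults.merge(user_options)

# With no caller options, the defaults come through unchanged.
untouched = defaults.merge({})
```

Since merge does not modify its receiver, the defaults stay intact for the next scanner instance.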

    # File lib/coderay/scanner.rb, line 120
    def initialize code='', options = {}, &block
      raise "I am only the basic Scanner class. I can't scan "\
        "anything. :( Use my subclasses." if self.class == Scanner

      @options = self.class::DEFAULT_OPTIONS.merge options

      super Scanner.normify(code)

      @tokens = options[:tokens]
      if @options[:stream]
        warn "warning in CodeRay::Scanner.new: :stream is set, "\
          "but no block was given" unless block_given?
        raise NotStreamableError, self unless kind_of? Streamable
        @tokens ||= TokenStream.new(&block)
      else
        warn "warning in CodeRay::Scanner.new: Block given, "\
          "but :stream is #{@options[:stream]}" if block_given?
        @tokens ||= Tokens.new
      end
      @tokens.scanner = self

      setup
    end

normify(code)

    # File lib/coderay/scanner.rb, line 69
    def normify code
      code = code.to_s
      if code.respond_to?(:encoding) && (code.encoding.name != 'UTF-8' || !code.valid_encoding?)
        code = code.dup
        original_encoding = code.encoding
        code.force_encoding 'Windows-1252'
        unless code.valid_encoding?
          code.force_encoding original_encoding
          if code.encoding.name == 'UTF-8'
            code.encode! 'UTF-16BE', :invalid => :replace, :undef => :replace, :replace => '?'
          end
          code.encode! 'UTF-8', :invalid => :replace, :undef => :replace, :replace => '?'
        end
      end
      code.to_unix
    end

streamable?()

Returns true if the Scanner can be used in streaming mode.

    # File lib/coderay/scanner.rb, line 65
    def streamable?
      is_a? Streamable
    end

Public Instance methods

code=(code)

Alias for string=

column(pos = self.pos)

    # File lib/coderay/scanner.rb, line 208
    def column pos = self.pos
      return 0 if pos <= 0
      string = string()
      if string.respond_to?(:bytesize) && (defined?(@bin_string) || string.bytesize != string.size)
        @bin_string ||= string.dup.force_encoding('binary')
        string = @bin_string
      end
      pos - (string.rindex(?\n, pos) || 0)
    end

each(&block)

Traverses the tokens.

    # File lib/coderay/scanner.rb, line 193
    def each &block
      raise ArgumentError,
        'Cannot traverse TokenStream.' if @options[:stream]
      tokens.each(&block)
    end

lang()

Returns the Plugin ID for this scanner.

    # File lib/coderay/scanner.rb, line 165
    def lang
      self.class.plugin_id
    end

line()

The current line position of the scanner.

Beware, this is implemented inefficiently. It should be used for debugging only.

    # File lib/coderay/scanner.rb, line 204
    def line
      string[0..pos].count("\n") + 1
    end
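
To make the cost of that expression visible, here is a small sketch that applies it to a plain StringScanner from the standard library instead of a CodeRay scanner. The sample input is made up for illustration.

```ruby
require 'strscan'

scanner = StringScanner.new("first\nsecond\nthird")

# Consume "first\nsec", leaving the scan pointer on the second line.
scanner.scan(/first\nsec/)

# Same expression as in the method above: copy the text from the start
# of the string up to the scan pointer, count its newlines, add one.
line = scanner.string[0..scanner.pos].count("\n") + 1
```

Each call copies and rescans everything before the scan pointer, so the work grows with the position in the input. That is presumably why the method is recommended for debugging only.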

marshal_dump()

    # File lib/coderay/scanner.rb, line 218
    def marshal_dump
      @options
    end

marshal_load(options)

    # File lib/coderay/scanner.rb, line 222
    def marshal_load options
      @options = options
    end

reset()

    # File lib/coderay/scanner.rb, line 144
    def reset
      super
      reset_instance
    end

streaming?()

Returns true if the scanner is in streaming mode.

    # File lib/coderay/scanner.rb, line 188
    def streaming?
      !!@options[:stream]
    end

string=(code)

    # File lib/coderay/scanner.rb, line 149
    def string= code
      code = Scanner.normify(code)
      if defined?(RUBY_DESCRIPTION) && RUBY_DESCRIPTION['rubinius 1.0.1']
        reset_state
        @string = code
      else
        super code
      end
      reset_instance
    end

tokenize(new_string=nil, options = {})

Scans the code and returns all tokens in a Tokens object.

    # File lib/coderay/scanner.rb, line 170
    def tokenize new_string=nil, options = {}
      options = @options.merge(options)
      self.string = new_string if new_string
      @cached_tokens =
        if @options[:stream]  # :stream must have been set already
          reset unless new_string
          scan_tokens @tokens, options
          @tokens
        else
          scan_tokens @tokens, options
        end
    end

tokens()

    # File lib/coderay/scanner.rb, line 183
    def tokens
      @cached_tokens ||= tokenize
    end

Protected Instance methods

raise_inspect(msg, tokens, state = 'No state given!', ambit = 30)

Raises a ScanError with additional status information.

    # File lib/coderay/scanner.rb, line 253
    def raise_inspect msg, tokens, state = 'No state given!', ambit = 30
      raise ScanError, "\n\n***ERROR in %s: %s (after %d tokens)\n\ntokens:\n%s\n\ncurrent line: %d  column: %d  pos: %d\nmatched: %p  state: %p\nbol? = %p,  eos? = %p\n\nsurrounding code:\n%p  ~~  %p\n\n\n***ERROR***\n\n" % [
        File.basename(caller[0]),
        msg,
        tokens.size,
        tokens.last(10).map { |t| t.inspect }.join("\n"),
        line, column, pos,
        matched, state, bol?, eos?,
        string[pos - ambit, ambit],
        string[pos, ambit],
      ]
    end

reset_instance()

    # File lib/coderay/scanner.rb, line 246
    def reset_instance
      @tokens.clear unless @options[:keep_tokens]
      @cached_tokens = nil
      @bin_string = nil if defined? @bin_string
    end

scan_tokens(tokens, options)

This is the central method, and commonly the only one a subclass implements.

Subclasses must implement this method; it must return tokens and must only use Tokens#<< for storing scanned tokens!

    # File lib/coderay/scanner.rb, line 241
    def scan_tokens tokens, options
      raise NotImplementedError,
        "#{self.class}#scan_tokens not implemented."
    end
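
The scan_tokens contract described above follows the template-method pattern. Below is a self-contained miniature of that pattern built on StringScanner alone. MiniScanner, WordScanner and the token kinds are hypothetical, a plain Array stands in for a Tokens object, and the options argument is left out for brevity; this is a sketch of the idea, not CodeRay's implementation.

```ruby
require 'strscan'

# Hypothetical miniature of the pattern; not CodeRay's actual classes.
class MiniScanner < StringScanner
  # Scans the whole input and returns the collected tokens.
  def tokenize
    scan_tokens([])
  end

  protected

  # Subclasses override this hook, as with Scanner#scan_tokens.
  def scan_tokens(tokens)
    raise NotImplementedError, "#{self.class}#scan_tokens not implemented."
  end
end

# Hypothetical subclass that splits input into words, spaces and the rest.
class WordScanner < MiniScanner
  protected

  def scan_tokens(tokens)
    until eos?
      if (text = scan(/\s+/))
        tokens << [text, :space]
      elsif (text = scan(/\w+/))
        tokens << [text, :ident]
      else
        tokens << [getch, :error]  # consume one character to guarantee progress
      end
    end
    tokens  # the hook must return the token collection
  end
end

result = WordScanner.new('nest ++').tokenize
```

The fallback branch matters: without it, an unmatched character would leave the scan pointer in place and the loop would never end.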

setup()

Can be implemented by subclasses to do some initialization that has to be done once per instance.

Use reset for initialization that has to be done once per scan.

    # File lib/coderay/scanner.rb, line 233
    def setup
    end
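
The split between setup and reset described above can be pictured with a small stand-alone sketch. HookDemo and its counters are hypothetical and exist only to show how often each hook runs; the class builds on StringScanner directly and is not part of CodeRay.

```ruby
require 'strscan'

# Hypothetical scanner that counts how often each hook runs.
class HookDemo < StringScanner
  attr_reader :setups, :resets

  def initialize(code)
    super
    @setups = 0
    @resets = 0
    setup            # called once per instance, as in Scanner#initialize
  end

  def reset
    super            # StringScanner#reset rewinds the scan pointer
    @resets += 1     # per-scan state would be reinitialized here
  end

  protected

  def setup
    @setups += 1     # e.g. build lookup tables that never change
  end
end

demo = HookDemo.new('a b')
demo.scan(/a/)
demo.reset
demo.reset
```

After these calls setup has run once and reset twice, and the scan pointer is back at the start of the input.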
