seattlerb / flay

Flay analyzes code for structural similarities. Differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc are all ignored.

Home Page: http://ruby.sadi.st/Flay.html

Ruby 100.00%

flay's Introduction

flay

home: ruby.sadi.st/
code: github.com/seattlerb/flay
rdoc: docs.seattlerb.org/flay/

DESCRIPTION:

Flay analyzes code for structural similarities. Differences in literal values, variable, class, method names, whitespace, programming style, braces vs do/end, etc are all ignored. Making this totally rad.

FEATURES/PROBLEMS:

  • Reports differences at any level of code.

  • Adds a score multiplier to identical nodes.

  • Differences in literal values, variable, class, and method names are ignored.

  • Differences in whitespace, programming style, braces vs do/end, etc are ignored.

  • Works across files.

    • Add the flay-persistence plugin to work across large/many projects.

  • Run --diff to see an N-way diff of the code.

  • Provides conservative (default) and --liberal pruning options.

  • Provides --fuzzy duplication detection.

  • Language independent: Plugin system allows other languages to be flayed.

    • Ships with .rb and .erb.

    • javascript and others will be available separately.

  • Includes FlayTask for Rakefiles.

  • Uses path_expander, so you can use:

    • dir_arg – expand a directory automatically

    • @file_of_args – persist arguments in a file

    • -path_to_subtract – ignore intersecting subsets of files/directories

  • Skips files matched via patterns in .flayignore (subset format of .gitignore).

  • Totally rad.
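
To make the features above concrete, here is a minimal sketch of what structural comparison means. It is not Flay's implementation: plain nested arrays are a hypothetical stand-in for parse trees, and the `structure` helper is an illustrative name. Node types are kept, while names and literal values are dropped before comparing.

```ruby
# Toy parse-tree nodes as nested arrays: [node_type, *children].
# Non-array children (symbols, strings, nil) stand in for names and literals.
def structure(node)
  children = node.drop(1).select { |child| child.is_a?(Array) }
  [node.first, *children.map { |child| structure(child) }]
end

# Hand-written approximation of: checkbox(name, value) + value
a = [:call, [:call, nil, :checkbox, [:lvar, :name], [:lvar, :value]], :+, [:lvar, :value]]
# Hand-written approximation of: radio_button(name, value) + value
b = [:call, [:call, nil, :radio_button, [:lvar, :name], [:lvar, :value]], :+, [:lvar, :value]]

same_shape = structure(a) == structure(b) # compares node types only
identical  = a == b                       # compares names and literals too

puts "same structure: #{same_shape}, identical: #{identical}"
```

The two trees differ only in a method name, so they share a structure without being identical, which mirrors the "Similar" versus "IDENTICAL" distinction in Flay's reports.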

KNOWN EXTENSIONS:

  • flay-actionpack

    Use Rails ERB handler.

  • flay-js

    Process JavaScript files.

  • flay-haml

    Flay your HAML source.

  • flay-persistence

    Persist results across runs. Great for multi-project analysis.

TODO:

  • Editor integration (emacs, textmate, other contributions welcome).

  • Vim integration started (github.com/prophittcorey/vim-flay)

    - Flays the current file on save, load, or on command

SYNOPSIS:

% flay -v --diff ~/Work/svn/ruby/ruby_1_8/lib/cgi.rb
Processing /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb...

Matches found in :defn (mass = 184)
  A: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:1470
  B: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:1925

A: def checkbox_group(name = "", *values)
B: def radio_group(name = "", *values)
     if name.kind_of?(Hash) then
       values = name["VALUES"]
       name = name["NAME"]
     end
     values.collect do |value|
       if value.kind_of?(String) then
A:       (checkbox(name, value) + value)
B:       (radio_button(name, value) + value)
       else
         if (value[(value.size - 1)] == true) then
A:         (checkbox(name, value[0], true) + value[(value.size - 2)])
B:         (radio_button(name, value[0], true) + value[(value.size - 2)])
         else
A:         (checkbox(name, value[0]) + value[(value.size - 1)])
B:         (radio_button(name, value[0]) + value[(value.size - 1)])
         end
       end
     end.to_s
   end

IDENTICAL Matches found in :for (mass*2 = 144)
  A: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:2160
  B: /Users/ryan/Work/svn/ruby/ruby_1_8/lib/cgi.rb:2217

   for element in ["HTML", "BODY", "P", "DT", "DD", "LI", "OPTION", "THEAD", "TFOOT", "TBODY", "COLGROUP", "TR", "TH", "TD", "HEAD"] do
     methods = (methods + (("          def #{element.downcase}(attributes = {})\n" + nO_element_def(element)) + "          end\n"))
   end
...

REQUIREMENTS:

  • ruby_parser

  • sexp_processor

  • path_expander

  • ruby2ruby – soft dependency: only if you want to use --diff

INSTALL:

  • sudo gem install flay

LICENSE:

(The MIT License)

Copyright © Ryan Davis, Seattle.rb

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

flay's People

Contributors

zenspider

flay's Issues

Collaboration with Codacy

Hi there, João Machado from Codacy here.

We have a Flay integration, so that everyone that uses our website can get Ruby duplication analysis. You can check it at https://github.com/codacy/codacy-duplication-flay

Would you be interested in mentioning us as an out-of-the-box solution for running Flay on your code? You can also add your repo to Codacy to check it out; we are free for open source projects.

Rake task ignores `dirs`

The attribute dirs seems to be ignored by rake FlayTask.

In FlayTask#define method, dirs should be passed as argument when invoking Flay.run (see flay_task.rb:48), which has the optional argument args that by default is set to ARGV (possibly not useful at all in this context).

Plugins not loaded when flay run programmatically

When running Flay programmatically, .erb files do not get processed.

It appears that Flay.load_plugins doesn't get called via this code path.

[~/Dropbox/WizeCommerce/apollo]$ irb        
1.9.2p290 :001 > require 'flay'
 => true 
1.9.2p290 :002 > f = Flay.new
 => #<Flay:0x007fbda230a860 @option={:diff=>false, :mass=>16, :summary=>false, :verbose=>false}, @hashes={}, @identical={}, @masses={}, @total=0, @mass_threshold=16> 
1.9.2p290 :003 > f.respond_to? 'process_erb'
 => false 
1.9.2p290 :004 > Flay.load_plugins
 => ["erb"] 
1.9.2p290 :005 > f.respond_to? 'process_erb'
 => true 
1.9.2p290 :006 > 

Add a data structure that contains the data generated by flay.report

Given

Total score (lower is better) = 494

1) IDENTICAL code found in :defn (mass*2 = 196)
  app/models/template_demo.rb:61
  app/models/template_purchase.rb:58

2) IDENTICAL code found in :defn (mass*2 = 144)
  app/models/template_demo.rb:16
  app/models/template_purchase.rb:25

3) Similar code found in :resbody (mass = 58)
  app/models/template.rb:80
  app/models/template.rb:125

4) Similar code found in :defn (mass = 50)
  app/controllers/my_templates_git_controller.rb:4
  app/controllers/my_templates_git_controller.rb:10

5) Similar code found in :call (mass = 46)
  app/inputs/checkbox_list_input.rb:25
  app/inputs/screenshots_input.rb:22

metric_fu, for example, parses this text report to yield (admittedly, not the most rigorous analysis)

{:total_score=>"494",
 :matches=>
  [{:reason=>"1) IDENTICAL code found in :defn (mass*2 = 196)",
    :matches=>
     [{:name=>"app/models/template_demo.rb", :line=>"61"},
      {:name=>"app/models/template_purchase.rb", :line=>"58"}]},
   {:reason=>"2) IDENTICAL code found in :defn (mass*2 = 144)",
    :matches=>
     [{:name=>"app/models/template_demo.rb", :line=>"16"},
      {:name=>"app/models/template_purchase.rb", :line=>"25"}]},
   {:reason=>"3) Similar code found in :resbody (mass = 58)",
    :matches=>
     [{:name=>"app/models/template.rb", :line=>"80"},
      {:name=>"app/models/template.rb", :line=>"125"}]},
   {:reason=>"4) Similar code found in :defn (mass = 50)",
    :matches=>
     [{:name=>"app/controllers/my_templates_git_controller.rb", :line=>"4"},
      {:name=>"app/controllers/my_templates_git_controller.rb", :line=>"10"}]},
   {:reason=>"5) Similar code found in :call (mass = 46)",
    :matches=>
     [{:name=>"app/inputs/checkbox_list_input.rb", :line=>"25"},
      {:name=>"app/inputs/screenshots_input.rb", :line=>"22"}]}]}

The problem is that it is not easily possible to reproduce the results of this report without either reproducing most of the code in 'report' or changing the meaning of puts and warn on the flay instance.
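
As an illustration of the kind of parsing this forces on consumers, here is a minimal sketch that turns the plain-text report into a hash. The `parse_flay_report` helper and its output keys are hypothetical, the sample is abbreviated from the report above (with the total adjusted to match), and the code assumes the non-diff report layout shown in this issue.

```ruby
# Abbreviated sample in the layout shown above; total adjusted to the two entries.
REPORT = <<~TEXT
  Total score (lower is better) = 254

  1) IDENTICAL code found in :defn (mass*2 = 196)
    app/models/template_demo.rb:61
    app/models/template_purchase.rb:58

  2) Similar code found in :resbody (mass = 58)
    app/models/template.rb:80
    app/models/template.rb:125
TEXT

# Hypothetical helper: walks the report line by line and builds a hash.
def parse_flay_report(text)
  result = { total_score: nil, matches: [] }
  text.each_line do |raw|
    line = raw.chomp
    case line
    when /\ATotal score .* = (\d+)\z/
      result[:total_score] = Integer($1)
    when /\A\d+\) (IDENTICAL|Similar) code found in :(\w+) \(mass(?:\*\d+)? = (\d+)\)\z/
      result[:matches] << {
        identical: $1 == "IDENTICAL",
        node_type: $2.to_sym,
        mass: Integer($3),
        locations: [],
      }
    when /\A\s+(.+):(\d+)\z/
      # Indented "path:line" entries belong to the most recent match header.
      next if result[:matches].empty?
      result[:matches].last[:locations] << { name: $1, line: Integer($2) }
    end
  end
  result
end

parsed = parse_flay_report(REPORT)
puts parsed[:total_score]
puts parsed[:matches].map { |m| m[:node_type] }.inspect
```

Any change to the report wording would break a parser like this, which is the motivation for asking Flay to expose the data directly.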

Now, if I were to run ruby -rpp -rflay -e "flay = Flay.new(Flay.default_options); files = Flay.expand_dirs_to_files(%w[app lib]); flay.process(*files); flay.analyze; pp [flay.total, flay.summary, flay.masses, flay.identical]" I get

[494,
 {"app/controllers/my_templates_git_controller.rb"=>50.0,
  "app/inputs/checkbox_list_input.rb"=>23.0,
  "app/inputs/screenshots_input.rb"=>23.0,
  "app/models/template.rb"=>58.0,
  "app/models/template_demo.rb"=>170.0,
  "app/models/template_purchase.rb"=>170.0},
 {-3749969949219652095=>50,
  1606873373262129863=>46,
  1034993600789789894=>58,
  -4553549444874076191=>144,
  3143048742042231397=>196},
 {-3749969949219652095=>false,
  1606873373262129863=>false,
  1034993600789789894=>false,
  -4553549444874076191=>true,
  3143048742042231397=>true}]

Per discussion in

[ASK] how the DuplicateCode works

Hello, I have a question about how DuplicateCode detection works. These lines of code are reported as DuplicateCode:

job_positions_controller.rb
job_position = JobPosition.new

job_levels_controller.rb
job_level = JobLevel.new

I don't really understand how rubycritic scans for duplicate code. Can anyone give me a hint? Thanks in advance.

NameError: uninitialized constant FlayTask::Flay

Dangit GitHub for creating this issue before I had a chance to fill everything out.

I just updated to the latest version of Flay (2.7.0 -> 2.8.0) and now my Rake task is failing.

Rake version: 11.1.2.

My Rakefile looks like this:

[...]
if Rails.env.test? || Rails.env.development?
  require 'flay_task'
  [...]
  FlayTask.new
  [...]
  task default: [:test, :flay]
end

After this Gem version bump, this now fails due to the error:

rake aborted!
NameError: uninitialized constant FlayTask::Flay
/var/lib/gems/2.3.0/gems/flay-2.8.0/lib/flay_task.rb:47:in `block in define'
Tasks: TOP => flay
(See full trace by running task with --trace)

It appears to be related to the commit: 4781d22#diff-3709294f47edd3f57c91adb09fa9953eL47

In this commit, the require 'flay' was removed causing Rakefiles that only depend on the flay_task to break, as mine did.

Is this expected behavior?

There is a workaround to just require 'flay' in my Rakefile, but this seems like it would break all existing consumers of Flay if they use it directly from their Rakefile, like I did.

flog flay score 1007.9

I ran flog over flay:

$ flog .
  1007.9: flog total
    20.2: flog/method average

   116.9: Flay#report                      ./lib/flay.rb:417
    89.0: TestSexp#test_prune_liberal      ./test/test_flay.rb:126
    69.0: TestSexp#test_prune              ./test/test_flay.rb:90
    54.2: Flay#process_fuzzy               ./lib/flay.rb:269
    52.4: TestSexp#test_all_structural_subhashes ./test/test_flay.rb:41
    52.0: Flay::parse_options              ./lib/flay.rb:39
    44.1: Flay#n_way_diff                  ./lib/flay.rb:366
    36.0: Flay#prune_liberally             ./lib/flay.rb:331
    34.7: FlayGauntlet#display_report      ./lib/gauntlet_flay.rb:38
    33.4: Flay#process                     ./lib/flay.rb:175
    29.3: main#none

Not flagging duplicate RSpec tests

I've got some inherited code that's some deeply nested, absolutely gigantic RSpec suites. For Reasons(tm), part way through some work involving the files, I 1) needed to duplicate a bunch of tests, and 2) needed to scan a PR for duplicated code that made it where it shouldn't. Anyway! Seems like Flay isn't picking up on it.

Example:

                    it 'calls ServiceWidget.complete_thing' do
                      expect(
                        mock_service_widget
                      ).to receive(:complete_thing).and_call_original
                      request
                    end
                    
                    it 'calls ServiceWidget.complete_thing' do
                      expect(
                        mock_service_widget
                      ).to receive(:complete_thing).and_call_original
                      request
                    end

Flay isn't reporting this duplicate, although it is reporting other duplicate code in this suite. It doesn't look like it's reporting this duplicated code as part of a larger block of duplicated code, either, which is very much A Thing(tm) in this file (alas).

It's very, very possible that the size (and other things) about these suites are breaking Flay, although it is successfully reporting other places with duplicate and similar code.

flay flay score 136

I ran flay over flay:

$ flay .
Total score (lower is better) = 136

1) IDENTICAL code found in :array (mass*2 = 100)
  ./test/test_flay.rb:119
  ./test/test_flay.rb:158

2) Similar code found in :iter (mass = 36)
  ./lib/flay.rb:65
  ./lib/flay.rb:86

Parse error

I get the following error when running flay on my project:

app/views/docs/new.html.erb:17 :: parse error on value ")" (tRPAREN)
skipping app/views/docs/new.html.erb

The code it references is:

<%= f.hidden_field :font_size %>

Dunno if it has any relevance, but it's directly inside a form tag:

<%= form_for @doc, :html => {:multipart => true, :class => "form-horizontal"} do |f| %>

Parsing issue with heredocs.

I'm getting this error:

  test.rb:1 :: parse error on value "<<" (tLSHFT)
  skipping test.rb

From this fragment:

<<~GRAPHQL
        mutation create_thing($params: NewThingParams!) {
          users_create_thing(params: $params) { user { id } }
        }
GRAPHQL

Using flay 2.10.0 and ruby_parser 3.9.0 under Ruby 2.4.1.

.flayignore

Would be nice to load .flayignore, like .gitignore files, to allow flay users to specify certain files and directories to ignore.

Invoking Flay During Code-Review/Unit Tests

This is a great tool!

I wanted to ask about building this more sustainably. I currently do this manually (with regexes 😢) on Flay's output.

Situation

  • I want to run Flay before submitting a PR to see if I can fix up the code I'm changing. I'm trying to find instances where code was copied and pasted in the past. I'd also want it to call me out for copy/pasting code.
  • Because I'm working in a legacy codebase where some code has been duplicated, I only care about the files in the diff I'm making. Fixing everything would be exhausting.
  • In bash, I use FILES=$(git diff --name-only HEAD $(git merge-base HEAD origin/master) | grep '[.]rb') to find a list of files that have changed.
  • Then, I invoke flay and search for complaints where at least one of the duplicated locations is in FILES.

Request

I'd expect to run flay in some way like flay --only-care-about=FILES lib/*.rb to only report issues where the issues include FILES.
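
The manual filtering described above could be sketched roughly as follows. This is a post-processing step over Flay's text output, not a Flay option: `blocks_touching` is a hypothetical helper, the sample report is made up in the layout Flay prints, and the code assumes the default (non-diff) report with blocks separated by blank lines.

```ruby
# Hypothetical helper: keep only the report blocks that mention a changed file.
def blocks_touching(report, changed_files)
  report.split(/\n{2,}/).select do |block|
    # Skip each block's first line (header); the rest are "path:line" entries.
    paths = block.lines.drop(1).map { |line| line.strip.sub(/:\d+\z/, "") }
    (paths & changed_files).any?
  end
end

report = <<~TEXT
  Total score (lower is better) = 104

  1) Similar code found in :resbody (mass = 58)
    app/models/template.rb:80
    app/models/template.rb:125

  2) Similar code found in :call (mass = 46)
    app/inputs/checkbox_list_input.rb:25
    app/inputs/screenshots_input.rb:22
TEXT

# In practice this list would come from the git diff command shown above.
changed = ["app/models/template.rb"]

selected = blocks_touching(report, changed)
puts selected
```

Filtering after the run, rather than restricting the files passed to flay, keeps duplications between a changed file and an unchanged one visible.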

Missing nodes that duplicate a subset of a larger duplication

Hi - if we have the following structures:

contained = s(:a,s(:b,s(:c)))
container = s(:d,contained)

program = s(:program,contained,container,container)

At the moment the prune algorithm will leave us blind to the duplicated code in contained. Prune will remove contained, so we only see the dupe between the 2 instances of contained inside container. We still have contained 3 times though.

Just wanted to check my reasoning that this is a problem.

I think it could be fixed by checking whether the length of hashes[contained.structural_hash] is equal to that of the containing node we're about to delete it in favour of. If it were different, flay could note that we have three sets of contained, and two of container, and report an additional duplication with a smaller cost for the extra contained.
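
The counting argument above can be checked with a small sketch. It uses plain nested arrays as a hypothetical stand-in for `s(...)` expressions and an illustrative `each_subtree` helper; it does not call Flay or its prune code.

```ruby
# Toy stand-in for Sexp: nested arrays of [type, *children].
def each_subtree(node, &block)
  yield node
  node.drop(1).each { |child| each_subtree(child, &block) if child.is_a?(Array) }
end

contained = [:a, [:b, [:c]]]
container = [:d, contained]
program   = [:program, contained, container, container]

# Count how often each subtree occurs anywhere in the program.
counts = Hash.new(0)
each_subtree(program) { |node| counts[node] += 1 }

contained_count = counts[contained] # standalone copy plus one inside each container
container_count = counts[container]
extra = contained_count - container_count

puts "contained: #{contained_count}, container: #{container_count}, extra: #{extra}"
```

A nonzero `extra` is the case described in this issue: copies of the inner node that are not accounted for by the outer duplication.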

CLI return value

I would like to have a non-zero exit code when using bundle exec flay to be able to use it in a CI and have a meaningful exit status.

I started to look at the code to see how I can contribute.

It is usually good to have an exit code that gives more information than 0 for OK and 1 for an error.

I saw 3 options here:

  • Exiting with self.total

    • this is not possible because bash exit status has to be between 0-255
    • score is quickly above 255, leading to overflow, exit code is different from total score and we can have an exit code equal to 0 with a non-zero total score
  • Exiting with data.size or self.summary.size

    • this is not possible because bash exit status has to be between 0-255, we could have more than 255 cases detected by the gem
    • both values are usually different, meaning a different exit code just by adding the summary option, it's losing its meaning
  • Exiting with 0 if self.total == 0, 1 otherwise

I would like to hear your thoughts on this before proposing a PR. Is there other interesting information that could be given as the exit status?
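
To illustrate the overflow problem with the first option and the behaviour of the third, here is a small sketch. Both helper names are hypothetical and nothing here is part of Flay; it relies only on the fact that POSIX exit statuses keep the low 8 bits.

```ruby
# What a shell would observe if a process exited with the raw total.
def wrapped_status(total)
  total % 256
end

# Third option from the list above: 0 when clean, 1 otherwise.
def boolean_status(total)
  total.zero? ? 0 : 1
end

[0, 136, 494, 512].each do |total|
  puts "total=#{total} wrapped=#{wrapped_status(total)} boolean=#{boolean_status(total)}"
end
```

A total of 512 wraps to 0 and would look like success in CI, while the boolean form stays nonzero for any nonzero score.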

-# and zshell

Does not work, # in zshell is used for pattern removal

Example of how to use .flayignore

Is there an example of how to use .flayignore? I'm trying to ignore GraphQL type files because the nature of their typing requires a lot of duplication. I created a .flayignore in the root level that contains the following lines:

# Ignore all GraphQL type files
/app/graphql/types/*

However, that doesn't work - it's still doing a duplicate code check in the files. What do I need to change here?
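
One thing worth checking is the leading slash. The sketch below only shows how Ruby's `File.fnmatch` treats the two forms against a relative path; whether Flay matches .flayignore entries this way is an assumption, and the file name is hypothetical.

```ruby
# Hypothetical relative path, as it might be passed when running `flay app`.
path = "app/graphql/types/user_type.rb"

anchored   = File.fnmatch("/app/graphql/types/*", path) # pattern from the issue
unanchored = File.fnmatch("app/graphql/types/*", path)  # same pattern, no leading slash

puts "with leading slash: #{anchored}"
puts "without leading slash: #{unanchored}"
```

If the ignore patterns are compared against relative paths, dropping the leading slash may be enough; if they are expanded to absolute paths first, the cause lies elsewhere.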

Parse error on named argument in ruby 2.0

$ cat test.rb 
class Test
  def initialize(variable, named_variable: 1)
  end
end

$ flay test.rb
  test.rb:2 :: parse error on value ":" (tCOLON)
  skipping test.rb
Total score (lower is better) = 0

parse error on value ":" (tCOLON)

When I run flay, I receive the following parse errors. I don't see anything wrong with the erb files. I've also seen a parse error on value ";" (tSEMI) in another file.

$ flay --diff lib/ config/ app/
app/views/devise/confirmations/new.html.erb:4 :: parse error on value ":" (tCOLON)
skipping app/views/devise/confirmations/new.html.erb
app/views/devise/passwords/edit.html.erb:4 :: parse error on value ":" (tCOLON)
skipping app/views/devise/passwords/edit.html.erb
app/views/devise/passwords/new.html.erb:4 :: parse error on value ":" (tCOLON)
skipping app/views/devise/passwords/new.html.erb

$ cat app/views/devise/confirmations/new.html.erb
<h2>Resend confirmation instructions</h2>

<%= form_for(resource, as: resource_name, url: confirmation_path(resource_name), html: { method: :post }) do |f| %>
  <%= devise_error_messages! %>

  <div><%= f.label :email %><br />
  <%= f.email_field :email, autofocus: true %></div>

  <div><%= f.submit "Resend confirmation instructions" %></div>
<% end %>

<%= render "devise/shared/links" %>


$ cat app/views/devise/passwords/edit.html.erb
<h2>Change your password</h2>

<%= form_for(resource, as: resource_name, url: password_path(resource_name), html: { method: :put }) do |f| %>
  <%= devise_error_messages! %>
  <%= f.hidden_field :reset_password_token %>

  <div><%= f.label :password, "New password" %><br />
    <%= f.password_field :password, autofocus: true, autocomplete: "off" %></div>

  <div><%= f.label :password_confirmation, "Confirm new password" %><br />
    <%= f.password_field :password_confirmation, autocomplete: "off" %></div>

  <div><%= f.submit "Change my password" %></div>
<% end %>

<%= render "devise/shared/links" %>


$ cat app/views/devise/passwords/new.html.erb
<h2>Forgot your password?</h2>

<%= form_for(resource, as: resource_name, url: password_path(resource_name), html: { method: :post }) do |f| %>
  <%= devise_error_messages! %>

  <div><%= f.label :email %><br />
  <%= f.email_field :email, autofocus: true %></div>

  <div><%= f.submit "Send me reset password instructions" %></div>
<% end %>

<%= render "devise/shared/links" %>

parse error on value "," (tCOMMA)

I am not sure, but this issue should probably be addressed to the ruby_parser gem.

Flay shows the following message:

  parse error on value "," (tCOMMA)
  skipping ./lib/ach/component.rb

when it tries to parse code like this:

define_method(method_name) do |*args, &block|
  # method body
end

Mass flag not returning expected results

When I run flay --mass 100 app/models I get

Total score (lower is better) = 0

When I run flay --mass 20 app/models I get

Total score (lower is better) = 1397


1) IDENTICAL code found in :call (mass*4 = 560)
  app/models/klass_one.rb:51
  app/models/klass_two.rb:270
  app/models/klass_three.rb:59
  app/models/klass_four.rb:62

2) IDENTICAL code found in :call (mass*3 = 369)
  app/models/klass_two.rb:276
  app/models/klass_three.rb:65
  app/models/klass_four.rb:68

3) IDENTICAL code found in :defn (mass*2 = 180)
  app/models/klass_two.rb:148
  app/models/klass_five.rb:27

4) IDENTICAL code found in :defs (mass*2 = 104)
  app/models/klass_two.rb:13
  app/models/klass_six.rb:5

5) Similar code found in :class (mass = 74)
  app/models/klass_seven.rb:3
  app/models/klass_eight.rb:3

6) Similar code found in :call (mass = 70)
  app/models/klass_one.rb:57
  app/models/klass_one.rb:63

7) Similar code found in :defn (mass = 40)
  app/models/klass_two.rb:241
  app/models/klass_nine.rb:140

Am I misunderstanding what mass is supposed to return?

cruby 1.9.3p385 running on osx mountain lion under rvm

a mass of 40 returns only numbers 2 and 3 of the above, but not number 1, which also perplexes me
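
The figures in this report are consistent with one reading of the threshold, sketched below. It assumes, without confirmation from the source, that the reported score is the per-node mass times the number of locations (times the number of locations again for IDENTICAL matches) and that `--mass` is compared against the per-node mass. `Match`, `node_mass`, and `visible` are illustrative names.

```ruby
Match = Struct.new(:number, :reported, :count, :identical)

# Scores, location counts, and IDENTICAL flags copied from the report above.
MATCHES = [
  Match.new(1, 560, 4, true),
  Match.new(2, 369, 3, true),
  Match.new(3, 180, 2, true),
  Match.new(4, 104, 2, true),
  Match.new(5, 74, 2, false),
  Match.new(6, 70, 2, false),
  Match.new(7, 40, 2, false),
]

# Assumed relationship between the reported score and a single node's mass.
def node_mass(match)
  divisor = match.identical ? match.count**2 : match.count
  match.reported / divisor
end

# Entries that would survive a given threshold under that assumption.
def visible(matches, threshold)
  matches.select { |m| node_mass(m) >= threshold }.map(&:number)
end

puts MATCHES.map { |m| node_mass(m) }.inspect
[20, 40, 100].each { |t| puts "--mass #{t}: #{visible(MATCHES, t).inspect}" }
```

Under this reading, entry 1 has a per-node mass of 35, which is why a threshold of 40 keeps only entries 2 and 3, and a threshold of 100 keeps nothing.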

parse error on value ")" (tRPAREN) on Rails ERb files

Running flay (as part of metric_fu), I have been getting errors parsing ERB files since around the time we updated to Ruby 3.0.

parse error on value ")" (tRPAREN)
skipping app/views/things/_form.html.erb

It seems like a minor issue because it continues to run and finishes without an outright failure, but we have a lot of views so the log gets flooded heavily with these errors.

I have tried also including flay-actionpack in case that fixed the issue, but no, I still get the errors despite having that gem installed as well.

Inconsistent use of mass.

It seems that when you specify a threshold for mass on a new Flay object, it is applied as the threshold for a single instance of the duplicate.

For instance, given the following file (test.rb):

class Test
  def start_date_valid?
    ticket_data['start_date'].blank? || Time.parse("#{ticket_data['start_date']}#{@ticket_set.serialized_data['pingout_timestamp']}") < Time.now
  end

  def end_date_valid?
    ticket_data['end_date'].blank? || Time.parse("#{ticket_data['end_date']}#{@ticket_set.serialized_data['pingout_timestamp']}") > Time.now
  end
end

and then run
flay = Flay.new({ :verbose => true, :mass => FLAY_THRESHHOLD})
flay.process(*Flay.expand_dirs_to_files(%w{test.rb}))
flay.report

when FLAY_THRESHHOLD is 23, you get
Total score (lower is better) = 0

when FLAY_THRESHHOLD is 22, you get

Total score (lower is better) = 44


1) Similar code found in :defn (mass = 44)
  test.rb:3
  test.rb:7

So you have a duplication with a total mass of 44, yet a mass threshold of 23 suppresses this duplication.

I'm assuming that each of the 2 lines has a mass of 22, which is what the threshold is being applied to, not the mass that is being calculated and reported.

flay doesn't recognize the new symbol array literal

Ruby 2.0.0 Preview 1 and 2 have support for a new symbol array literal. For example, this is valid Ruby 2.0 syntax:

  stages = %i[whitespace correct_grammar contractions dedupe_punct abbr
              remove_vowels dedupe_consonants apostrophes]

However, flay doesn't like it. It complains:

Bad %string type. Expected [QqWwxrs], found 'i'.. near line 13: "whitespace correct_grammar contractions dedupe_punct abbr"

and then completely skips the entire file.

zsh: command not found: flay

On OSX 10.11.3:

$ sudo gem install flay
Fetching: flay-2.8.0.gem (100%)
Successfully installed flay-2.8.0
Parsing documentation for flay-2.8.0
Installing ri documentation for flay-2.8.0
Done installing documentation for flay after 0 seconds
1 gem installed
$ flay
zsh: command not found: flay

Flay not detecting duplicate code

Hello,

I have run Flay against some code that is obviously duplicated, such as this test case:

    ##
    # I am a dog.
    class Dog
      def x
        return "Hello"
      end
    end
    ##
    # I
    # am
    # a
    # cat.
    class Cat
      def y
        return "Hello"
      end
    end

However, I can't seem to get flay to report the duplication. I only receive Total score (lower is better) = 0
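
One possible explanation is the mass threshold, which defaults to 16 according to the `Flay.new` output quoted in the plugins issue. The sketch below estimates the mass of the `Dog` class by counting nodes in a hand-written approximation of its parse tree. The tree shape and the `mass` helper are assumptions for illustration, not output from ruby_parser.

```ruby
# Assumed definition: a node's mass is 1 plus the mass of its nested nodes.
def mass(node)
  1 + node.drop(1).select { |child| child.is_a?(Array) }.sum { |child| mass(child) }
end

# Hand-written approximation of the tree for:
#   class Dog; def x; return "Hello"; end; end
dog = [:class, :Dog, nil,
       [:defn, :x, [:args],
        [:return, [:str, "Hello"]]]]

default_threshold = 16 # from the @mass_threshold value quoted earlier on this page

class_mass = mass(dog)
puts "estimated mass: #{class_mass}, threshold: #{default_threshold}"
puts "would be reported: #{class_mass >= default_threshold}"
```

If this estimate is roughly right, running with a lower `--mass` value would be the first thing to try.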

output report in a machine-readable format

As far as I've seen, it is not yet possible to output the result in a machine-parseable format.

For instance, running flay -# -d lib/*.rb in this project outputs:

$ flay -# -d lib/*.rb
Total score (lower is better) = 72

Similar code found in :iter (mass = 36)
  A: lib/flay.rb:74
  B: lib/flay.rb:99

A: opts.on("-m", "--mass MASS", Integer, "Sets mass threshold (default = #{options[:mass]})") do |m|
B: opts.on("-t", "--timeout TIME", Integer, "Set the timeout. (default = #{options[:timeout]})") do |t|
A:   options[:mass] = m.to_i
B:   options[:timeout] = t.to_i
   end

Similar code found in :defn (mass = 36)
  A: lib/flay_erb.rb:28
  B: lib/flay_erb.rb:36

A: def add_expr_literal(src, code)
B: def add_expr_escaped(src, code)
     if code.=~(BLOCK_EXPR) then
A:     ((src << "@output_buffer.append= ") << code)
B:     ((src << "@output_buffer.safe_append= ") << code)
     else
A:     (((src << "@output_buffer.append=(") << code) << ");")
B:     (((src << "@output_buffer.safe_append=(") << code) << ");")
     end
   end

Something like this would be useful for parsing (JSON example, XML or YAML would also be valid):

{
    "total": 72,
    "clones": [
        {
            "match": ":iter",
            "mass": 36,
            "A": {
                "filename": "lib/flay.rb",
                "line": 74
            },
            "B": {
                "filename": "lib/flay.rb",
                "line": 99
            },
            "lines": [
                {
                    "source": "A",
                    "content": "opts.on(\"-m\", \"--mass MASS\", Integer, \"Sets mass threshold (default = #{options[:mass]})\") do |m|"
                },
                {
                    "source": "B",
                    "content": "opts.on(\"-t\", \"--timeout TIME\", Integer, \"Set the timeout. (default = #{options[:timeout]})\") do |t|"
                },
                {
                    "source": "A",
                    "content": "   options[:mass] = m.to_i"
                },
                {
                    "source": "B",
                    "content": "   options[:timeout] = t.to_i"
                },
                {
                    "source": "Common",
                    "content": "   end"
                }
            ]
        },
        {
            "match": ":defn",
            "mass": 36,
            "A": {
                "filename": "lib/flay_erb.rb",
                "line": 28
            },
            "B": {
                "filename": "lib/flay_erb.rb",
                "line": 36
            },
            "lines": [
                {
                    "source": "A",
                    "content": "def add_expr_literal(src, code)"
                },
                {
                    "source": "B",
                    "content": "def add_expr_escaped(src, code)"
                },
                {
                    "source": "Common",
                    "content": "    if code.=~(BLOCK_EXPR) then"
                },
                {
                    "source": "A",
                    "content": "     ((src << \"@output_buffer.append= \") << code)"
                },
                {
                    "source": "B",
                    "content": "     ((src << \"@output_buffer.safe_append= \") << code)"
                },
                {
                    "source": "Common",
                    "content": "     else"
                },
                {
                    "source": "A",
                    "content": "     (((src << \"@output_buffer.append=(\") << code) << \");\")"
                },
                {
                    "source": "B",
                    "content": "     (((src << \"@output_buffer.safe_append=(\") << code) << \");\")"
                },
                {
                    "source": "Common",
                    "content": "     end"
                },
                {
                    "source": "Common",
                    "content": "   end"
                }
            ]
        }
    ]
}

Would anyone object to this? If no one objects, I'm willing to start the feature and open a PR.
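
As a rough sketch of how such output could be produced, the snippet below serializes a hand-built hash with Ruby's standard `json` library. The hash layout follows the proposal above and is abbreviated to part of the second clone; none of it is an existing Flay API. Letting the library do the serialization also takes care of escaping the double quotes inside the code lines.

```ruby
require "json"

# Abbreviated, hand-built data in the proposed shape (not produced by Flay).
report = {
  total: 72,
  clones: [
    {
      match: ":defn",
      mass: 36,
      A: { filename: "lib/flay_erb.rb", line: 28 },
      B: { filename: "lib/flay_erb.rb", line: 36 },
      lines: [
        { source: "A", content: '((src << "@output_buffer.append= ") << code)' },
        { source: "B", content: '((src << "@output_buffer.safe_append= ") << code)' },
      ],
    },
  ],
}

json = JSON.pretty_generate(report)
puts json

# Round trip: the embedded quotes survive because the generator escapes them.
restored = JSON.parse(json)
first_line = restored["clones"][0]["lines"][0]["content"]
```

The same hash could be dumped as YAML with the standard `yaml` library if that format were preferred.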

Overlapping nodes are reported as similar

code:

case a
when 1:
  b = 1
when 2:
  b = 1
when 3:
  b = 1
else
  b = 1
end

Diff reported by flay:

A: when 1:
B: when 2:
C: when 3:
     b = 1
A: when 2:
B: when 3:
C: else
     b = 1
A: when 3:
B: else
C: end
A:   b = 1 
B:   b = 1
A: else
B: end
A:   b = 1
A: end

N-way diff output can contain preceding comment lines not in duplicated block.

Ruby file #1:

# Comment

$:.push("#{File.dirname(__FILE__)}/../VolumeManager")

# Comment 2

    class ConsoleFormatter < Log4r::Formatter
        def format(event)
            (event.data.kind_of?(String) ? event.data : event.data.inspect) + "\n"
        end
    end

Ruby file #2:

# Comment from test3
$:.push("#{File.dirname(__FILE__)}/..")

class ConsoleFormatter < Log4r::Formatter
    def format(event)
        (event.data.kind_of?(String) ? event.data : event.data.inspect) + "\n"
    end
end

Output of "flay --diff file1 file2":

[root@gallen-rhel6 flay]# flay --diff test2.rb test3.rb 
Total score (lower is better) = 68

1) IDENTICAL code found in :class (mass*2 = 68)
  A: test2.rb:7
  B: test3.rb:4

A: # Comment
B: # Comment from test3
B: class ConsoleFormatter < Log4r::Formatter
A: # Comment 2
B:   def format(event)
B:     ((event.data.kind_of?(String) ? (event.data) : (event.data.inspect)) + "\n")
A: class ConsoleFormatter < Log4r::Formatter
B:   end
A:   def format(event)
B: end
A:     ((event.data.kind_of?(String) ? (event.data) : (event.data.inspect)) + "\n")
A:   end
A: end

These may not in fact be legal ruby files. I hacked them up from my sources to create a small test case. But flay runs on them just fine, and I get the same output with the real sources.

Note the comment lines in the diff output. None are part of the identical code.

Why is this happening?

reek warnings

I ran reek over flay:

$ reek lib/
lib//flay.rb -- 95 warnings:
  Array#delete_eql contains iterators nested 2 deep (NestedIterators)
  Array#delete_eql has the variable name 'o1' (UncommunicativeVariableName)
  Array#delete_eql has the variable name 'o2' (UncommunicativeVariableName)
  Flay declares the class variable @@plugins (ClassVariable)
  Flay has no descriptive comment (IrresponsibleModule)
  Flay#analyze contains iterators nested 2 deep (NestedIterators)
  Flay#analyze has the variable name 'n' (UncommunicativeVariableName)
  Flay#initialize has the variable name 'h' (UncommunicativeVariableName)
  Flay#initialize has the variable name 'k' (UncommunicativeVariableName)
  Flay#n_way_diff calls s.scan(/^.*/) twice (DuplicateMethodCall)
  Flay#n_way_diff contains iterators nested 2 deep (NestedIterators)
  Flay#n_way_diff doesn't depend on instance state (UtilityFunction)
  Flay#n_way_diff has approx 16 statements (TooManyStatements)
  Flay#n_way_diff has the variable name 'c' (UncommunicativeVariableName)
  Flay#n_way_diff has the variable name 'i' (UncommunicativeVariableName)
  Flay#n_way_diff has the variable name 'l' (UncommunicativeVariableName)
  Flay#n_way_diff has the variable name 'o' (UncommunicativeVariableName)
  Flay#n_way_diff has the variable name 's' (UncommunicativeVariableName)
  Flay#n_way_diff refers to s more than self (FeatureEnvy)
  Flay#process calls e.message twice (DuplicateMethodCall)
  Flay#process has approx 14 statements (TooManyStatements)
  Flay#process has the variable name 'e' (UncommunicativeVariableName)
  Flay#process performs a nil-check. (NilCheck)
  Flay#process_fuzzy calls code.size twice (DuplicateMethodCall)
  Flay#process_fuzzy calls new_node.structural_hash twice (DuplicateMethodCall)
  Flay#process_fuzzy calls node.size twice (DuplicateMethodCall)
  Flay#process_fuzzy calls self.hashes twice (DuplicateMethodCall)
  Flay#process_fuzzy calls self.hashes[new_node.structural_hash] twice (DuplicateMethodCall)
  Flay#process_fuzzy contains iterators nested 3 deep (NestedIterators)
  Flay#process_fuzzy has approx 11 statements (TooManyStatements)
  Flay#process_fuzzy has the variable name 'n' (UncommunicativeVariableName)
  Flay#process_sexp calls option twice (DuplicateMethodCall)
  Flay#process_sexp calls option[:fuzzy] twice (DuplicateMethodCall)
  Flay#process_sexp contains iterators nested 2 deep (NestedIterators)
  Flay#prune calls self.hashes twice (DuplicateMethodCall)
  Flay#prune calls self.hashes.delete_if twice (DuplicateMethodCall)
  Flay#prune_conservatively calls self.hashes twice (DuplicateMethodCall)
  Flay#prune_conservatively contains iterators nested 2 deep (NestedIterators)
  Flay#prune_conservatively has the variable name 'h' (UncommunicativeVariableName)
  Flay#prune_liberally calls self.hashes 3 times (DuplicateMethodCall)
  Flay#prune_liberally calls self.masses twice (DuplicateMethodCall)
  Flay#prune_liberally contains iterators nested 3 deep (NestedIterators)
  Flay#prune_liberally has approx 12 statements (TooManyStatements)
  Flay#prune_liberally has the variable name 'h' (UncommunicativeVariableName)
  Flay#prune_liberally has the variable name 'k' (UncommunicativeVariableName)
  Flay#prune_liberally has the variable name 'v' (UncommunicativeVariableName)
  Flay#report calls hashes 4 times (DuplicateMethodCall)
  Flay#report calls hashes[h] 3 times (DuplicateMethodCall)
  Flay#report calls hashes[h].first 3 times (DuplicateMethodCall)
  Flay#report calls node.first twice (DuplicateMethodCall)
  Flay#report calls nodes.first twice (DuplicateMethodCall)
  Flay#report calls option 4 times (DuplicateMethodCall)
  Flay#report calls option[:diff] twice (DuplicateMethodCall)
  Flay#report calls puts 3 times (DuplicateMethodCall)
  Flay#report calls x.file 3 times (DuplicateMethodCall)
  Flay#report calls x.line 3 times (DuplicateMethodCall)
  Flay#report calls x.modified? twice (DuplicateMethodCall)
  Flay#report contains iterators nested 2 deep (NestedIterators)
  Flay#report has approx 27 statements (TooManyStatements)
  Flay#report has the variable name 'c' (UncommunicativeVariableName)
  Flay#report has the variable name 'h' (UncommunicativeVariableName)
  Flay#report has the variable name 'i' (UncommunicativeVariableName)
  Flay#report has the variable name 'm' (UncommunicativeVariableName)
  Flay#report has the variable name 'n' (UncommunicativeVariableName)
  Flay#report has the variable name 's' (UncommunicativeVariableName)
  Flay#report has the variable name 'v' (UncommunicativeVariableName)
  Flay#report has the variable name 'x' (UncommunicativeVariableName)
  Flay#report is controlled by argument prune (ControlParameter)
  Flay#self.expand_dirs_to_files has the variable name 'p' (UncommunicativeVariableName)
  Flay#self.load_plugins has approx 9 statements (TooManyStatements)
  Flay#self.load_plugins has the variable name 'e' (UncommunicativeVariableName)
  Flay#self.load_plugins has the variable name 'p' (UncommunicativeVariableName)
  Flay#self.parse_options calls opts.separator("") 3 times (DuplicateMethodCall)
  Flay#self.parse_options contains iterators nested 2 deep (NestedIterators)
  Flay#self.parse_options has approx 27 statements (TooManyStatements)
  Flay#self.parse_options has the variable name 'e' (UncommunicativeVariableName)
  Flay#self.parse_options has the variable name 'm' (UncommunicativeVariableName)
  Flay#self.parse_options has the variable name 'n' (UncommunicativeVariableName)
  Flay#self.parse_options has the variable name 't' (UncommunicativeVariableName)
  Flay#summary contains iterators nested 2 deep (NestedIterators)
  Flay#summary has approx 6 statements (TooManyStatements)
  Flay#update_masses calls masses 4 times (DuplicateMethodCall)
  Flay#update_masses calls nodes.size twice (DuplicateMethodCall)
  Flay#update_masses has approx 6 statements (TooManyStatements)
  Sexp has no descriptive comment (IrresponsibleModule)
  Sexp#+ has the parameter name 'o' (UncommunicativeParameterName)
  Sexp#[] has the parameter name 'a' (UncommunicativeParameterName)
  Sexp#[] has the variable name 's' (UncommunicativeVariableName)
  Sexp#[] has unused parameter 'a' (UnusedParameters)
  Sexp#initialize_copy has the parameter name 'o' (UncommunicativeParameterName)
  Sexp#initialize_copy has the variable name 's' (UncommunicativeVariableName)
  Sexp#initialize_copy refers to o more than self (FeatureEnvy)
  Sexp#initialize_copy refers to s more than self (FeatureEnvy)
  Sexp#split_at has the parameter name 'n' (UncommunicativeParameterName)
  String has no descriptive comment (IrresponsibleModule)
lib//flay_erb.rb -- 2 warnings:
  Flay has no descriptive comment (IrresponsibleModule)
  Flay#process_erb has the variable name 'e' (UncommunicativeVariableName)
lib//flay_task.rb -- 6 warnings:
  FlayTask has no descriptive comment (IrresponsibleModule)
  FlayTask#define calls dirs twice (DuplicateMethodCall)
  FlayTask#define calls flay.total twice (DuplicateMethodCall)
  FlayTask#define calls threshold twice (DuplicateMethodCall)
  FlayTask#define has approx 8 statements (TooManyStatements)
  FlayTask#initialize has the variable name 'f' (UncommunicativeVariableName)
lib//gauntlet_flay.rb -- 7 warnings:
  FlayGauntlet#display_report has approx 13 statements (TooManyStatements)
  FlayGauntlet#display_report has the variable name 'i' (UncommunicativeVariableName)
  FlayGauntlet#score_for doesn't depend on instance state (UtilityFunction)
  FlayGauntlet#score_for has approx 6 statements (TooManyStatements)
  FlayGauntlet#score_for has the variable name 'f' (UncommunicativeVariableName)
  FlayGauntlet#score_for has unused parameter 'dir' (UnusedParameters)
  FlayGauntlet#score_for refers to flay more than self (FeatureEnvy)

Maybe ignore some of these warnings with a .reek file?

Number of lines in a duplication?

This may be more of a question than an issue....

The output from flay looks like this:

  1. IDENTICAL code found in :defn (mass*2 = 64)
    ./event_catcher_testing.rb:44
    ./event_catcher_production.rb:33

This tells me where to find the identical code.

But from this output how can I determine how many lines of code are in the duplication?

Define pattern to ignore

Hi. Is there a way to define patterns to ignore? For example, some repositories contain code like

    def method_name
        fail 'not implemented'
    end

This causes CodeClimate to report a big mass for these abstract methods. Thanks.

rake task errors on Rake::TaskLib constant

With a simple rakefile:

require 'flay_task'
FlayTask.new

I get:

$ rake flay
rake aborted!
NameError: uninitialized constant Rake::TaskLib
<project>/Rakefile:4:in `require'
<project>/Rakefile:4:in `<top (required)>'
(See full trace by running task with --trace)

The flay rake task ought to be requiring rake/tasklib since it uses that class.
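
Until that is fixed in flay_task itself, a possible workaround is to load the task library in the Rakefile before flay_task. This is only a sketch, and it assumes the installed rake still ships rake/tasklib as a separately requirable file:

```ruby
# Rakefile
# Load Rake::TaskLib first so the constant exists when flay_task is required.
require 'rake/tasklib'
require 'flay_task'

FlayTask.new
```

With this in place, `rake flay` should get past the NameError shown above.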

No secondary sort of files with same score

If several files have the same score, the default sort may order them differently on different machine architectures.

I found this when comparing flay results for the same project between a Linux machine and an M1 MacBook.

Example output, machine 1:

33.00: c.rb
33.00: b.rb
33.00: a.rb

Machine 2:

33.00: b.rb
33.00: a.rb
33.00: c.rb

Fix:

Line 487 of ./lib/flay.rb is:

  self.summary.sort_by { |_,v| -v }.each do |file, score|

Replace it with:

  self.summary.sort_by { |f,v| [-v, f] }.each do |file, score|

This ensures the summary is sorted by score first, then alphabetically by file name.
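
The effect of the tie-breaker can be checked in isolation. The snippet below is a minimal sketch that does not use flay; the `summary` hash and its scores are made up for illustration. Ruby's `sort_by` is not guaranteed to be stable, so sorting on the score alone leaves the order of tied files unspecified, while a compound `[-score, file]` key fixes it.

```ruby
# Hypothetical summary data (file name => score), not taken from a real flay run.
summary = { "c.rb" => 33.0, "b.rb" => 33.0, "a.rb" => 33.0, "d.rb" => 50.0 }

# Sort by descending score, then by file name to break ties deterministically.
ordered = summary.sort_by { |file, score| [-score, file] }

ordered.each do |file, score|
  puts "%8.2f: %s" % [score, file]
end
```

The highest-scoring file comes first, and the three tied files always appear as a.rb, b.rb, c.rb regardless of platform.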

Should update specification to allow latest version of ruby_parser

ruby_parser 2.0.6 barfs on Ruby 1.9's new hash syntax.

parse error on value ":" (tCOLON)
/usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:349:in `on_error'
/usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `_racc_do_parse_c'
/usr/local/rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/racc/parser.rb:99:in `do_parse'
/usr/local/rvm/gems/ruby-1.9.2-p290@trunk_gems/gems/ruby_parser-2.0.6/lib/ruby_parser_extras.rb:749:in `process'
/usr/local/rvm/gems/ruby-1.9.2-p290@trunk_gems/gems/flog-2.5.3/lib/flog.rb:241:in `block in flog'

-# and zshell

The -# option does not work in zsh, where # is used for pattern removal.
