One of TopicEmbed.import spec fails with Ruby 3

I’d like to get some help about failing TopicEmbed.import spec with Ruby 3.

Steps to reproduce

cd discourse
bundle exec rspec ./spec/models/topic_embed_spec.rb:154

Expected behavior

It should finish successfully.

Actual behavior with Ruby 3.0.2

$ bundle exec rspec ./spec/models/topic_embed_spec.rb:154
Run options: include {:locations=>{"./spec/models/topic_embed_spec.rb"=>[154]}}

Randomized with seed 41701
F

Failures:

  1) TopicEmbed.import embedded content truncation keeps everything in the imported post when truncation is disabled
     Failure/Error: expect(post.raw).to include(long_content)

       expected "<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...for the original entry at <a href=\"http://eviltrout.com/123\">http://eviltrout.com/123</a></small>" to include "<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...for the original entry at <a href='http://eviltrout.com/123'>http://eviltrout.com/123</a></small>\n"
       Diff:
       @@ -1,4 +1,7 @@
       -<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>\n<p>more</p>\n<hr>\n<small>This is a companion discussion topic for the original entry at <a href='http://eviltrout.com/123'>http://eviltrout.com/123</a></small>\n
       +<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>
       +<p>more</p>
       +<hr>
       +<small>This is a companion discussion topic for the original entry at <a href="http://eviltrout.com/123">http://eviltrout.com/123</a></small>

     # ./spec/models/topic_embed_spec.rb:158:in `block (4 levels) in <top (required)>'
     # ./spec/rails_helper.rb:279:in `block (2 levels) in <top (required)>'

Finished in 2.48 seconds (files took 2.73 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/models/topic_embed_spec.rb:154 # TopicEmbed.import embedded content truncation keeps everything in the imported post when truncation is disabled

Randomized with seed 41701

$

Actual behavior with Ruby 2.7.4

$ ruby -v
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x86_64-linux]
$ bundle exec rspec ./spec/models/topic_embed_spec.rb:154
Run options: include {:locations=>{"./spec/models/topic_embed_spec.rb"=>[154]}}

Randomized with seed 4126
.

Finished in 2.51 seconds (files took 2.31 seconds to load)
1 example, 0 failures

Randomized with seed 4126

$

What I’ve found so far

Added some print debug code and found the long_content inside the topic_embed_spec.rb was overwritten by contents of app/models/topic_embed.rb

Print debug code

$ git diff
diff --git a/app/models/topic_embed.rb b/app/models/topic_embed.rb
index 1e9f06a322..ab407d8005 100644
--- a/app/models/topic_embed.rb
+++ b/app/models/topic_embed.rb
@@ -28,6 +28,7 @@ class TopicEmbed < ActiveRecord::Base

   # Import an article from a source (RSS/Atom/Other)
   def self.import(user, url, title, contents, category_id: nil, cook_method: nil, tags: nil)
+    p "contents.length at the begining of self.import: #{contents.length}"
     return unless url =~ /^https?\:\/\//

     if SiteSetting.embed_truncate && cook_method.nil?
@@ -102,6 +103,7 @@ class TopicEmbed < ActiveRecord::Base
       end
     end

+    p "contents.length at the end of self.import: #{contents.length}"
     post
   end

diff --git a/spec/models/topic_embed_spec.rb b/spec/models/topic_embed_spec.rb
index 8659eb7d56..1266441da7 100644
--- a/spec/models/topic_embed_spec.rb
+++ b/spec/models/topic_embed_spec.rb
@@ -153,7 +153,9 @@ describe TopicEmbed do

       it 'keeps everything in the imported post when truncation is disabled' do
         SiteSetting.embed_truncate = false
+        p "long_content.length before calling TopicEmbed.import: #{long_content.length}"
         post = TopicEmbed.import(user, url, title, long_content)
+        p "long_content.length after calling TopicEmbed.import: #{long_content.length}"

         expect(post.raw).to include(long_content)
       end
$

Result with Ruby 3.0.2

$ bundle exec rspec ./spec/models/topic_embed_spec.rb:154
Run options: include {:locations=>{"./spec/models/topic_embed_spec.rb"=>[154]}}

Randomized with seed 24988
"long_content.length before calling TopicEmbed.import: 119"
"contents.length at the begining of self.import: 119"
"contents.length at the end of self.import: 267"
"long_content.length after calling TopicEmbed.import: 267"
F

Failures:

  1) TopicEmbed.import embedded content truncation keeps everything in the imported post when truncation is disabled
     Failure/Error: expect(post.raw).to include(long_content)

       expected "<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...for the original entry at <a href=\"http://eviltrout.com/123\">http://eviltrout.com/123</a></small>" to include "<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...for the original entry at <a href='http://eviltrout.com/123'>http://eviltrout.com/123</a></small>\n"
       Diff:
       @@ -1,4 +1,7 @@
       -<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>\n<p>more</p>\n<hr>\n<small>This is a companion discussion topic for the original entry at <a href='http://eviltrout.com/123'>http://eviltrout.com/123</a></small>\n
       +<p>aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa</p>
       +<p>more</p>
       +<hr>
       +<small>This is a companion discussion topic for the original entry at <a href="http://eviltrout.com/123">http://eviltrout.com/123</a></small>

     # ./spec/models/topic_embed_spec.rb:160:in `block (4 levels) in <top (required)>'
     # ./spec/rails_helper.rb:279:in `block (2 levels) in <top (required)>'

Finished in 2.55 seconds (files took 3.4 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/models/topic_embed_spec.rb:154 # TopicEmbed.import embedded content truncation keeps everything in the imported post when truncation is disabled

Randomized with seed 24988

$

Result with Ruby 2.7.4

$ bundle exec rspec ./spec/models/topic_embed_spec.rb:154
Run options: include {:locations=>{"./spec/models/topic_embed_spec.rb"=>[154]}}

Randomized with seed 1802
"long_content.length before calling TopicEmbed.import: 119"
"contents.length at the begining of self.import: 119"
"contents.length at the end of self.import: 267"
"long_content.length after calling TopicEmbed.import: 119"
.

Finished in 2.56 seconds (files took 3.06 seconds to load)
1 example, 0 failures

Randomized with seed 1802

$
3 Likes

I’ve got why this spec fails only with Ruby 3.0.2.

Since Ruby 3, interpolated strings are not frozen even if frozen-string-literal magic comment is enabled.
https://bugs.ruby-lang.org/issues/17104

So here are possible ways to fix it.

  1. Use contens.dup instead of +contents
$ git diff
diff --git a/app/models/topic_embed.rb b/app/models/topic_embed.rb
index 1e9f06a322..502409d3c1 100644
--- a/app/models/topic_embed.rb
+++ b/app/models/topic_embed.rb
@@ -34,7 +34,7 @@ class TopicEmbed < ActiveRecord::Base
       contents = first_paragraph_from(contents)
     end
     contents ||= ''
-    contents = +contents << imported_from_html(url)
+    contents = contents.dup << imported_from_html(url)

     url = normalize_url(url)

$
  1. Freeze interpolated long_content
diff --git a/spec/models/topic_embed_spec.rb b/spec/models/topic_embed_spec.rb
index 8659eb7d56..a25351a586 100644
--- a/spec/models/topic_embed_spec.rb
+++ b/spec/models/topic_embed_spec.rb
@@ -142,7 +142,7 @@ describe TopicEmbed do
     describe 'embedded content truncation' do
       MAX_LENGTH_BEFORE_TRUNCATION = 100

-      let(:long_content) { "<p>#{'a' * MAX_LENGTH_BEFORE_TRUNCATION}</p>\n<p>more</p>" }
+      let(:long_content) { "<p>#{'a' * MAX_LENGTH_BEFORE_TRUNCATION}</p>\n<p>more</p>".freeze }

       it 'truncates the imported post when truncation is enabled' do
         SiteSetting.embed_truncate = true
$

I’m inclined to open a pull request to use contens.dup and found the idea was implemented but reverted via Revert "FIX: Use #dup instead of #+@ since content could be an instan… · discourse/discourse@d01c938 · GitHub

I did not find some other backgrounds for the revert, I’d like to hear from contributors/maintainers.

2 Likes

Opened FIX: Ruby 3 does not freeze interpolated string by yahonda · Pull Request #14567 · discourse/discourse · GitHub

2 Likes