Parsing an API source, but the JSON source is not updating (cache?)

I have written a rake task that monitors and fetches data from a website that publishes it in JSON format. This is the actual source of the data:

https://www.thegazette.co.uk/company/07877158/filings/data.json

The rake task monitors "total_count" in the above JSON; when it changes, the task fetches and saves any new information.

The issue is that after the first run, the value never updates. As a current real-world example, the above JSON source was updated overnight with two new records, so "total_count" increased from 40 to 42, but my rake task still reports 40 (and therefore does nothing, because it thinks nothing has changed).

I think it is a cache issue, but clearing my Rails cache made no difference. It is strange, because I don't have this issue with similar rake tasks I have written for other sites.

My rake code is as follows:

    desc "Monitor"
    task :S_01 => :environment do

      require 'rubygems'
      require 'open-uri'
      require 'openssl'
      require 'net/http'
      require 'json'

      # Fetch the URL and return the parsed JSON body
      def g_api(url)
        uri = URI.parse(url)
        request = Net::HTTP::Get.new(uri)
        request.content_type = "application/json"
        req_options = {
          use_ssl: uri.scheme == "https",
        }

        response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
          http.request(request)
        end
        data = JSON.parse(response.body)
      end

      company = CompanyBorrower.where(id: 43)

      company.each do |f|
        begin
          # scrape source
          tg_fh_url = "https://www.thegazette.co.uk/company/" + f.ch + "/filings/data.json"
          gf_scrape = g_api(tg_fh_url)

          ch_s = gf_scrape.fetch('total_count', nil) # scrape

          puts ch_s

          if not f.filing_count == ch_s # has the count changed? if not, skip
            # note: ch_fh3 is not defined in the snippet as posted
            f.update_attributes(cwdetail1: ch_s, filing_update: ch_fh3)

            gf_scrape['items'].first(3).each_with_index do |f1, index|
              # fetch & save data here
            end
          end
        rescue
          next
        end
      end

    end

EDIT

I added the following to the code, but now get an error:

    response["Cache-Control: no-cache"]

    NoMethodError: undefined method `fetch' for nil:NilClass

The updated method:

    def g_api(url)
      uri = URI.parse(url)
      request = Net::HTTP::Get.new(uri)
      request.content_type = "application/json"
      req_options = {
        use_ssl: uri.scheme == "https",
      }

      response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
        http.request(request)
      end

      data = JSON.parse(response.body)

      response["Cache-Control: no-cache"]
    end

edited Nov 12 '18 at 12:12

asked Nov 12 '18 at 11:08

JimBob


To cut that down to a minimal example, are you saying that Net::HTTP::Get.new(URI.parse("https://..../company/07877158/filings/...")) gives you a different result from accessing that URL in a browser? Perhaps the solution is to provide a Cache-Control: no-cache header in the request?

– Tom Lord
Nov 12 '18 at 11:55

For full details, see section 14.9 (Cache-Control) of the HTTP/1.1 specification, RFC 2616.

– Tom Lord
Nov 12 '18 at 11:57

You need to set the Cache-Control header on the request. It's not something you look up in the response. See the documentation: docs.ruby-lang.org/en/2.0.0/Net/…

– Tom Lord
Nov 12 '18 at 12:25
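
To illustrate the point in this comment, here is a small sketch of why the edited g_api produces the NoMethodError. It builds a response object locally instead of calling the site, and the helper names sample_response and g_api_like are made up for the illustration. Indexing a response with a string reads the header of that name, and a Ruby method returns its last expression, so a header lookup on the final line replaces the parsed data as the return value.

```ruby
require 'net/http'

# Build a response locally (no network call) so header lookups can be inspected.
# sample_response is a hypothetical helper, not part of the original task.
def sample_response
  response = Net::HTTPOK.new('1.1', '200', 'OK')
  response['Content-Type'] = 'application/json'
  response
end

# Mirrors the shape of the edited g_api: the last expression is a header lookup,
# so that lookup (not the parsed data) becomes the method's return value.
def g_api_like(response)
  data = { 'total_count' => 42 }      # stand-in for JSON.parse(response.body)
  response['Cache-Control: no-cache'] # no header has this name, so this is nil
end

response = sample_response
puts response['Content-Type'].inspect            # a header that exists
puts response['Cache-Control: no-cache'].inspect # a header name that does not exist
puts g_api_like(response).inspect

begin
  # Same call the rake task makes on the return value of g_api.
  g_api_like(response).fetch('total_count', nil)
rescue NoMethodError => e
  puts e.class
end
```

Under these assumptions, the error comes from the method's return value being nil rather than from anything the server sent back.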

Or, for instance, you can see some examples of setting headers here: yukimotopress.github.io/http. Again, don't confuse reading headers from the response with setting headers on your request.

– Tom Lord
Nov 12 '18 at 12:27
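
As a rough sketch of what setting the header on the request could look like with Net::HTTP, the version below keeps the structure of the question's g_api and adds the header before the request is sent. The build_request helper is a name introduced here for illustration. This only shows where the header goes; another comment on this question reports that sending the header still gave the same result, so it should not be read as a confirmed fix.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Build the GET request and set headers on it before it is sent.
# build_request is a hypothetical helper, not part of the original task.
def build_request(uri)
  request = Net::HTTP::Get.new(uri)
  request.content_type = 'application/json'
  request['Cache-Control'] = 'no-cache' # request header: ask caches to revalidate
  request
end

# Send the request and return the parsed JSON as the last expression.
def g_api(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(uri))
  end
  JSON.parse(response.body)
end
```

With this shape, g_api(tg_fh_url).fetch('total_count', nil) operates on the parsed Hash again, because JSON.parse is the final expression of the method.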

It looks like even passing the correct headers gives the same result. I would suggest having a look at github.com/TheGazette/DevDocs/blob/master/home.md

– lacostenycoder
Nov 12 '18 at 13:01

ruby-on-rails json ruby rake