Parsing an API source, but the JSON source is not updating (cache?)
I have written a rake task to monitor and fetch data from a website that publishes this data in JSON format. The following is the actual source of this data:
https://www.thegazette.co.uk/company/07877158/filings/data.json
The rake task monitors the "total_count" value in the above JSON and, when it changes, fetches and saves any new information.
The issue is that after the first time it checks that page, the value simply doesn't update. As a real-world example, the above JSON source was updated overnight with two new records, so "total_count" increased from 40 to 42, but my rake task still reports 40 (and subsequently does nothing because it thinks nothing has changed).
I think it is a cache issue, but clearing my Rails cache made no difference. It is strange because I don't have this issue with similar rake tasks I have created for other sites.
My rake code is as follows:
desc "Monitor"
task :S_01 => :environment do
  require 'rubygems'
  require 'open-uri'
  require 'openssl'
  require 'net/http'
  require 'json'

  # Fetch the URL and return the parsed JSON body
  def g_api(url)
    uri = URI.parse(url)
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    req_options = {
      use_ssl: uri.scheme == "https",
    }
    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end
    data = JSON.parse(response.body)
  end

  company = CompanyBorrower.where(id: 43)
  company.each do |f|
    begin
      # scrape source
      tg_fh_url = "https://www.thegazette.co.uk/company/" + f.ch + "/filings/data.json"
      gf_scrape = g_api(tg_fh_url)
      ch_s = gf_scrape.fetch('total_count', nil) # scrape
      puts ch_s
      if not f.filing_count == ch_s # has the count changed? if not, skip
        f.update_attributes(cwdetail1: ch_s, filing_update: ch_fh3)
        gf_scrape['items'].first(3).each_with_index do |f1, index|
          # fetch & save data here
        end
      end
    rescue
      next
    end
  end
end
EDIT
I added the following line to the code, but now get an error:
response["Cache-Control: no-cache"]
NoMethodError: undefined method `fetch' for nil:NilClass
def g_api(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request.content_type = "application/json"
  req_options = {
    use_ssl: uri.scheme == "https",
  }
  response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
    http.request(request)
  end
  data = JSON.parse(response.body)
  response["Cache-Control: no-cache"]
end
ruby-on-rails json ruby rake
2
To cut that down to a minimal example, are you saying that Net::HTTP::Get.new(URI.parse("https://..../company/07877158/filings/...")) gives you a different result vs accessing that URL from a browser? Perhaps the solution is to provide a Cache-Control: no-cache header in the request?
– Tom Lord
Nov 12 '18 at 11:55
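One way to make the minimal comparison suggested in this comment is to fetch the URL with nothing but the standard library and print the count next to any cache-related response headers. The following is only a sketch: the helper names `cache_headers` and `fetch_total_count` and the choice of headers to inspect are assumptions for illustration, not something from the original post.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Response headers that commonly hint at an intermediate cache (an assumed selection).
CACHE_HEADERS = %w[Cache-Control Age ETag Last-Modified Expires].freeze

# Hypothetical helper: collect whichever of those headers the response carries.
def cache_headers(response)
  CACHE_HEADERS.each_with_object({}) do |name, found|
    value = response[name]
    found[name] = value if value
  end
end

# Hypothetical helper: fetch the JSON and return the count plus caching hints.
def fetch_total_count(url)
  response = Net::HTTP.get_response(URI.parse(url))
  body = JSON.parse(response.body)
  { total_count: body['total_count'], headers: cache_headers(response) }
end

# Usage (performs a real request, so it is left commented out):
# p fetch_total_count("https://www.thegazette.co.uk/company/07877158/filings/data.json")
```

If the printed count differs from what a browser shows at the same moment, the headers may help tell whether a cached copy was served.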
1
For full details, see section 14.9 of the HTTP/1.1 protocol
– Tom Lord
Nov 12 '18 at 11:57
2
You need to set the Cache-Control header in the request. It's not a lookup in the response. See the documentation: docs.ruby-lang.org/en/2.0.0/Net/…
– Tom Lord
Nov 12 '18 at 12:25
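A small sketch of why the edited g_api raises NoMethodError: indexing a response reads a header by name, and a Ruby method returns the value of its last expression. The synthetic response, the example header value, and the method name `last_expression_wins` are assumptions made so the snippet runs without network access.

```ruby
require 'net/http'

# A synthetic 200 response, built locally so no request is made.
response = Net::HTTPOK.new('1.1', '200', 'OK')
response['Cache-Control'] = 'max-age=600' # example value, not taken from the Gazette

# Indexing a response READS a header by name. No header is named
# "Cache-Control: no-cache", so the lookup yields nil.
lookup = response["Cache-Control: no-cache"]

# Hypothetical stand-in for the edited g_api: the header lookup is the last
# expression, so the method returns nil rather than the parsed data.
def last_expression_wins(response, parsed)
  data = parsed
  response["Cache-Control: no-cache"]
end

result = last_expression_wins(response, { 'total_count' => 42 })
puts lookup.inspect
puts result.inspect
```

Because the method returns nil, the later call to fetch('total_count', nil) is made on nil, which matches the error quoted in the question.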
1
Or for instance, you can see some examples of setting headers here: yukimotopress.github.io/http. Again, don't get mixed up between reading the headers in the response, vs setting the headers in your request.
– Tom Lord
Nov 12 '18 at 12:27
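As a rough illustration of setting the header on the request rather than reading it from the response, the sketch below splits request construction into a separate method. The name `build_request` and the Accept header are assumptions for illustration; whether the server or any intermediate cache honours the header is not established here, and a later comment reports the same result even with it set.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical helper: build a GET request that asks caches to revalidate.
def build_request(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['Accept'] = 'application/json'
  request['Cache-Control'] = 'no-cache' # set on the REQUEST, before it is sent
  request
end

# Send the request and return the parsed JSON as the method's last expression.
def g_api(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(url))
  end
  JSON.parse(response.body)
end

# Usage (performs a real request, so it is left commented out):
# puts g_api("https://www.thegazette.co.uk/company/07877158/filings/data.json").fetch('total_count', nil)
```

Keeping JSON.parse as the final expression means the caller receives the parsed hash, so fetch works as before.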
1
It looks like even passing the correct headers gives the same result. I would suggest having a look at github.com/TheGazette/DevDocs/blob/master/home.md
– lacostenycoder
Nov 12 '18 at 13:01
edited Nov 12 '18 at 12:12
asked Nov 12 '18 at 11:08
JimBob
527