Parsing an API source, but the JSON source is not updating (cache?)

























I have written a rake task that monitors and fetches data from a website that publishes it in JSON format. This is the actual source of the data:



https://www.thegazette.co.uk/company/07877158/filings/data.json



The rake task watches the "total_count" value in that JSON. When the value changes, the task fetches and saves any new information.



The issue is that after the first run the value never updates. As a current real-world example, the JSON source above was updated overnight with two new records, so "total_count" increased from 40 to 42, but my rake task still reports 40 (and then does nothing, because it thinks nothing has changed).



I think it is a cache issue, but clearing my Rails cache has not helped. It is strange, because I don't have this issue with similar rake tasks I have written for other sites.



My rake code is as follows:



desc "Monitor"
task :S_01 => :environment do

  require 'rubygems'
  require 'open-uri'
  require 'openssl'

  def g_api(url)
    uri = URI.parse(url)
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end
    data = JSON.parse(response.body)

  end

  company = CompanyBorrower.where(id: 43)

  company.each do |f|

    begin

      #scrape source
      tg_fh_url = "https://www.thegazette.co.uk/company/"+f.ch+"/filings/data.json"
      gf_scrape = g_api(tg_fh_url)

      ch_s = gf_scrape.fetch('total_count', nil) #scrape

      puts ch_s

      if not f.filing_count == ch_s # has the count changed? if not, skip

        f.update_attributes(cwdetail1: ch_s, filing_update: ch_fh3)

        gf_scrape['items'].first(3).each_with_index do |f1, index|

          #fetch & save data here

        end

      end

    rescue
      next
    end

  end

end


EDIT



I added the following line to the code, but now get an error:



 response["Cache-Control: no-cache"]



NoMethodError: undefined method `fetch' for nil:NilClass




def g_api(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request.content_type = "application/json"
  req_options = {
    use_ssl: uri.scheme == "https",
  }

  response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
    http.request(request)
  end

  data = JSON.parse(response.body)

  response["Cache-Control: no-cache"]

end




























  • 2





    To cut that down to a minimal example, are you saying that Net::HTTP::Get.new(URI.parse("https://..../company/07877158/filings/...")) gives you a different result vs accessing that URL from a browser? Perhaps the solution is to provide a Cache-Control: no-cache header in the request?

    – Tom Lord
    Nov 12 '18 at 11:55






  • 1





    For full details, see section 14.9 (Cache-Control) of the HTTP/1.1 specification, RFC 2616.

    – Tom Lord
    Nov 12 '18 at 11:57






  • 2





    You need to set the Cache-Control in the request. It's not a lookup in the response. See the documentation: docs.ruby-lang.org/en/2.0.0/Net/…

    – Tom Lord
    Nov 12 '18 at 12:25






  • 1





    Or for instance, you can see some examples of setting headers here: yukimotopress.github.io/http. Again, don't get mixed up between reading the headers in the response, vs setting the headers in your request.

    – Tom Lord
    Nov 12 '18 at 12:27
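
As a rough sketch of what the comments above describe, the header can be set on the request object before it is sent. The helper name build_fresh_request is made up for illustration, and the extra Pragma header is an assumption aimed at older proxies; neither comes from the original code. Whether the Gazette server, or any cache in front of it, honors the header is not something this sketch can guarantee.

```ruby
require "net/http"
require "uri"
require "json"

# Build a GET request that asks caches along the way to revalidate.
# build_fresh_request is a hypothetical helper, not part of the original task.
def build_fresh_request(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request["Accept"] = "application/json"
  request["Cache-Control"] = "no-cache" # set on the REQUEST, not read from the response
  request["Pragma"] = "no-cache"        # assumption: HTTP/1.0 equivalent for older proxies
  request
end

# Send the request and return the parsed JSON body.
def g_api(url)
  uri = URI.parse(url)
  request = build_fresh_request(url)
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  JSON.parse(response.body) # last expression, so this is what g_api returns
end
```

By contrast, response["Cache-Control: no-cache"] only looks up a response header with that literal name, finds none, and returns nil. As the last expression in g_api, that nil becomes the return value, which would explain the NoMethodError on fetch.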






  • 1





    It looks like even passing the correct headers gives the same result. I would suggest having a look at github.com/TheGazette/DevDocs/blob/master/home.md

    – lacostenycoder
    Nov 12 '18 at 13:01
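
If the request headers make no difference, it may help to check what the response itself says about caching. This is a minimal sketch, assuming a Net::HTTP response object is at hand; cache_report and CACHE_HEADERS are hypothetical names, and the list of headers is just the usual set, not something documented by The Gazette.

```ruby
require "net/http"

# Headers that commonly reveal whether a cache served the response.
CACHE_HEADERS = %w[Age Cache-Control ETag Last-Modified Expires].freeze

# Return the cache-related headers present on a Net::HTTP response.
# cache_report is a hypothetical helper used only for illustration.
def cache_report(response)
  CACHE_HEADERS.each_with_object({}) do |name, report|
    value = response[name] # header lookup is case-insensitive, nil when absent
    report[name] = value unless value.nil?
  end
end
```

Calling puts cache_report(response).inspect inside g_api, before parsing the body, would show for example a non-zero Age value, which usually indicates the body came from a shared cache rather than from the origin server.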






















ruby-on-rails json ruby rake














asked Nov 12 '18 at 11:08 by JimBob
edited Nov 12 '18 at 12:12 by JimBob













