<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>CostruzioneUtensili</title>
<link rel="stylesheet" type="text/css" media="all" charset="utf-8" href="acaro/css/common.css">
<link rel="stylesheet" type="text/css" media="screen" charset="utf-8" href="acaro/css/screen.css">
<link rel="stylesheet" type="text/css" media="print" charset="utf-8" href="acaro/css/print.css">
<style type="text/css">
ul.pagetitle{
display: inline;
margin: 0;
padding: 0;
font-size: 1.5em;
}
li.pagetitle{
display: inline;
margin: 0;
}
td.noborder {
border: 0;
}
</style>
</head>
<body>
<table>
<tr>
<td class="noborder">
<img src="logo.png" width="85" height="85">
</td>
<td class="noborder">
<ul class="pagetitle">
<li class="pagetitle"><a class="backlink">CostruzioneUtensili</a>
</ul>
<br><br>
[<a href="FrontPage.html">FrontPage</a>]
</td>
</tr>
</table>
<hr>
<div id="page">
<div dir="ltr" id="content" lang="it"><span class="anchor" id="top"></span>
<span class="anchor" id="line-1-6"></span><span class="anchor" id="line-2"></span><span class="anchor" id="line-3"></span><p class="line867">
<h1 id="Costruzione_Utensili">Tool Building</h1>
<span class="anchor" id="line-4"></span><span class="anchor" id="line-5"></span><p class="line874">Culture is our Nature: we are hunters and gatherers in a world of information. <span class="anchor" id="line-6"></span><span class="anchor" id="line-7"></span><p class="line867">
<h2 id="Prerequisiti">Prerequisites</h2>
<span class="anchor" id="line-8"></span><ul><li>A vague idea of HTML <span class="anchor" id="line-9"></span></li><li>Being able to write, or even just read, any programming language <span class="anchor" id="line-10"></span><span class="anchor" id="line-11"></span></li></ul><p class="line867">
<h2 id="Programma">Program</h2>
<span class="anchor" id="line-12"></span><span class="anchor" id="line-13"></span><p class="line874">A series of afternoons of free experimentation, followed by a workshop open to the public. <span class="anchor" id="line-14"></span><span class="anchor" id="line-15"></span><p class="line867">
<h2 id="Temi">Topics</h2>
<span class="anchor" id="line-16"></span><span class="anchor" id="line-17"></span><p class="line874">Still to be defined, but roughly: <span class="anchor" id="line-18"></span><ul><li>Finding your way around with the browser inspector <span class="anchor" id="line-19"></span></li><li>Basics of web scraping with Python: <span class="anchor" id="line-20"></span><ul><li>GET requests and a fake user agent with requests <span class="anchor" id="line-21"></span></li><li>Beautiful Soup and/or lxml for parsing pages <span class="anchor" id="line-22"></span></li><li>Web spiders with Scrapy <span class="anchor" id="line-23"></span></li></ul></li><li>wget and a bit of bash? <span class="anchor" id="line-24"></span><span class="anchor" id="line-25"></span></li></ul><p class="line867">
<h2 id="Riferimenti_Sparsi">Assorted References</h2>
<span class="anchor" id="line-26"></span><ul><li><p class="line891"><a class="https" href="https://elitedatascience.com/python-web-scraping-libraries">https://elitedatascience.com/python-web-scraping-libraries</a> <span class="anchor" id="line-27"></span></li><li><p class="line891"><a class="https" href="https://first-web-scraper.readthedocs.io/en/latest/">https://first-web-scraper.readthedocs.io/en/latest/</a> <span class="anchor" id="line-28"></span></li><li><p class="line891"><a class="https" href="https://medium.com/@kaismh/extracting-data-from-websites-using-scrapy-e1e1e357651a">https://medium.com/@kaismh/extracting-data-from-websites-using-scrapy-e1e1e357651a</a> <span class="anchor" id="line-29"></span></li><li><p class="line891"><a class="https" href="https://deshmukhsuraj.wordpress.com/2015/03/08/anonymous-web-scraping-using-python-and-tor/">https://deshmukhsuraj.wordpress.com/2015/03/08/anonymous-web-scraping-using-python-and-tor/</a> <span class="anchor" id="line-30"></span><span class="anchor" id="line-31"></span></li></ul><p class="line867">
<h2 id="Terminale">Terminal</h2>
<span class="anchor" id="line-32"></span><span class="anchor" id="line-33"></span><p class="line867">
<h3 id="curl">curl</h3>
<span class="anchor" id="line-34"></span><p class="line867"><span class="anchor" id="line-35"></span><span class="anchor" id="line-36"></span><pre><span class="anchor" id="line-1"></span>curl "http://www.example.com"</pre><span class="anchor" id="line-37"></span><p class="line874">performs a GET request and prints the response <span class="anchor" id="line-38"></span><span class="anchor" id="line-39"></span><p class="line867"><span class="anchor" id="line-40"></span><span class="anchor" id="line-41"></span><pre><span class="anchor" id="line-1-1"></span>curl -o out.html "http://www.example.com"</pre><span class="anchor" id="line-42"></span><p class="line862">now the output is saved to the file <em>out.html</em> <span class="anchor" id="line-43"></span><span class="anchor" id="line-44"></span><p class="line867">
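<p class="line874">As a rough Python counterpart to the two curl commands above, here is a minimal sketch that uses only the standard library's <em>urllib.request</em>. The helper names <em>fetch</em> and <em>save</em> are made up for this page, and the sketch assumes the response can be decoded as UTF-8, which a real site may not guarantee.</p>
<pre>
```python
#!/usr/bin/env python3
from urllib.request import urlopen


def fetch(url):
    """Perform a GET and return the response body as text (like plain curl)."""
    with urlopen(url) as response:
        # Assumption: the body is UTF-8; undecodable bytes are replaced.
        return response.read().decode("utf-8", errors="replace")


def save(url, path):
    """Perform a GET and write the body to a file (like curl -o)."""
    text = fetch(url)
    with open(path, "w", encoding="utf-8") as out:
        out.write(text)
    return path
```
</pre>
<p class="line874">For example, <em>print(fetch("http://www.example.com"))</em> mirrors the first command and <em>save("http://www.example.com", "out.html")</em> mirrors the second.</p>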
<h3 id="wget">wget</h3>
<span class="anchor" id="line-45"></span><p class="line867"><span class="anchor" id="line-46"></span><span class="anchor" id="line-47"></span><pre><span class="anchor" id="line-1-2"></span>wget "http://www.example.com/index.html"</pre><span class="anchor" id="line-48"></span><p class="line862">saves the content to <em>index.html</em> <span class="anchor" id="line-49"></span><span class="anchor" id="line-50"></span><p class="line867"><span class="anchor" id="line-51"></span><span class="anchor" id="line-52"></span><pre><span class="anchor" id="line-1-3"></span>wget -r "http://www.example.com/"</pre><span class="anchor" id="line-53"></span><p class="line862">recursively saves <strong>all</strong> of the site's reachable content (five levels of links deep by default) into a directory named after the host, inside the current directory <span class="anchor" id="line-54"></span><span class="anchor" id="line-55"></span><p class="line867">
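<p class="line874">Conceptually, a recursive download repeats three steps: fetch a page, extract its links, and queue the ones that stay on the same host. The sketch below covers only the link-extraction step, using the Python standard library. It is not how wget is implemented, and the names <em>LinkCollector</em> and <em>same_host_links</em> are invented for illustration.</p>
<pre>
```python
#!/usr/bin/env python3
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every anchor tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_links(page_url, page_html):
    """Return absolute links found in page_html that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(page_html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```
</pre>
<p class="line874">A crawler would call <em>same_host_links</em> on each downloaded page and stop at a chosen depth, much as the -r option does.</p>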
<h3 id="Python">Python</h3>
<span class="anchor" id="line-56"></span><p class="line867"><span class="anchor" id="line-57"></span><span class="anchor" id="line-58"></span><pre><span class="anchor" id="line-1-4"></span>python3 script.py</pre><span class="anchor" id="line-59"></span><p class="line874">runs a script <span class="anchor" id="line-60"></span><span class="anchor" id="line-61"></span><p class="line867"><span class="anchor" id="line-62"></span><span class="anchor" id="line-63"></span><pre><span class="anchor" id="line-1-5"></span>python3 script.py &gt; out.txt</pre><span class="anchor" id="line-64"></span><p class="line862">runs a script and saves its output to <em>out.txt</em> <span class="anchor" id="line-65"></span><span class="anchor" id="line-66"></span><p class="line867">
<h2 id="Codice">Code</h2>
<span class="anchor" id="line-67"></span><span class="anchor" id="line-68"></span><p class="line867">
<h3 id="Scraping">Scraping</h3>
<span class="anchor" id="line-69"></span><p class="line874">Print the list of Macao's spaces: <span class="anchor" id="line-70"></span><span class="anchor" id="line-71"></span><span class="anchor" id="line-72"></span><span class="anchor" id="line-73"></span><span class="anchor" id="line-74"></span><span class="anchor" id="line-75"></span><span class="anchor" id="line-76"></span><span class="anchor" id="line-77"></span><span class="anchor" id="line-78"></span><span class="anchor" id="line-79"></span><span class="anchor" id="line-80"></span><span class="anchor" id="line-81"></span><span class="anchor" id="line-82"></span><span class="anchor" id="line-83"></span><span class="anchor" id="line-84"></span><span class="anchor" id="line-85"></span><span class="anchor" id="line-1-7"></span><div class="highlight python3"><div class="codearea" dir="ltr" lang="en">
<script type="text/javascript">
function isnumbered(obj) {
return obj.childNodes.length && obj.firstChild.childNodes.length && obj.firstChild.firstChild.className == 'LineNumber';
}
function nformat(num,chrs,add) {
var nlen = Math.max(0,chrs-(''+num).length), res = '';
while (nlen>0) { res += ' '; nlen-- }
return res+num+add;
}
function addnumber(did, nstart, nstep) {
var c = document.getElementById(did), l = c.firstChild, n = 1;
if (!isnumbered(c)) {
if (typeof nstart == 'undefined') nstart = 1;
if (typeof nstep == 'undefined') nstep = 1;
var n = nstart;
while (l != null) {
if (l.tagName == 'SPAN') {
var s = document.createElement('SPAN');
var a = document.createElement('A');
s.className = 'LineNumber';
a.appendChild(document.createTextNode(nformat(n,4,'')));
a.href = '#' + did + '_' + n;
s.appendChild(a);
s.appendChild(document.createTextNode(' '));
n += nstep;
if (l.childNodes.length) {
l.insertBefore(s, l.firstChild);
}
else {
l.appendChild(s);
}
}
l = l.nextSibling;
}
}
return false;
}
function remnumber(did) {
var c = document.getElementById(did), l = c.firstChild;
if (isnumbered(c)) {
while (l != null) {
if (l.tagName == 'SPAN' && l.firstChild.className == 'LineNumber') l.removeChild(l.firstChild);
l = l.nextSibling;
}
}
return false;
}
function togglenumber(did, nstart, nstep) {
var c = document.getElementById(did);
if (isnumbered(c)) {
remnumber(did);
} else {
addnumber(did,nstart,nstep);
}
return false;
}
</script>
<script type="text/javascript">
document.write('<a href="#" onclick="return togglenumber(\'CA-9111a916e892a1f257425d0cede6cf32f9811e1c\', 1, 1);" \
class="codenumbers">Toggle line numbers<\/a>');
</script>
<pre dir="ltr" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c" lang="en"><span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_1"></span><span class="anchor" id="line-1-8"></span><span class="Comment">#!/usr/bin/env python3</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_2"></span><span class="anchor" id="line-2-1"></span><span class="ResWord">import</span> <span class="ID">requests</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_3"></span><span class="anchor" id="line-3-1"></span><span class="ResWord">from</span> <span class="ID">bs4</span> <span class="ResWord">import</span> <span class="ID">BeautifulSoup</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_4"></span><span class="anchor" id="line-4-1"></span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_5"></span><span class="anchor" id="line-5-1"></span><span class="ID">url</span> = <span class="String">"</span><span class="String">http://www.macaomilano.org/spip.php?rubrique18</span><span class="String">"</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_6"></span><span class="anchor" id="line-6-1"></span><span class="ID">r</span> = <span class="ID">requests</span>.<span class="ID">get</span>(<span class="ID">url</span>)  <span class="Comment"># HTTP GET of the page that lists the spaces</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_7"></span><span class="anchor" id="line-7-1"></span><span class="ID">page</span> = <span class="ID">r</span>.<span class="ID">text</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_8"></span><span class="anchor" id="line-8-1"></span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_9"></span><span class="anchor" id="line-9-1"></span><span class="ID">soup</span> = <span class="ID">BeautifulSoup</span>(<span class="ID">page</span>, <span class="String">"</span><span class="String">html.parser</span><span class="String">"</span>)</span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_10"></span><span class="anchor" id="line-10-1"></span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_11"></span><span class="anchor" id="line-11-1"></span><span class="ID">h2s</span> = <span class="ID">soup</span>.<span class="ID">find_all</span>(<span class="String">"</span><span class="String">h2</span><span class="String">"</span>)  <span class="Comment"># find_all replaces the deprecated findAll alias</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_12"></span><span class="anchor" id="line-12-1"></span><span class="ID">spazi</span> = [<span class="ID">h2</span>.<span class="ID">text</span> <span class="ResWord">for</span> <span class="ID">h2</span> <span class="ResWord">in</span> <span class="ID">h2s</span>]  <span class="Comment"># spazi = spaces: the text of each h2 heading</span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_13"></span><span class="anchor" id="line-13-1"></span></span>
<span class="line"><span class="LineAnchor" id="CA-9111a916e892a1f257425d0cede6cf32f9811e1c_14"></span><span class="anchor" id="line-14-1"></span><span class="ResWord">print</span>(<span class="String">"</span><span class="SPChar">\n</span><span class="String">"</span>.<span class="ID">join</span>(<span class="ID">spazi</span>))</span>
</pre></div></div><span class="anchor" id="line-86"></span><span class="anchor" id="line-87"></span><p class="line862">Same harvest, but with XPath <img alt=":)" height="16" src="/moin_static1911/acaro/img/smile.png" title=":)" width="16" /> <span class="anchor" id="line-88"></span><span class="anchor" id="line-89"></span><span class="anchor" id="line-90"></span><span class="anchor" id="line-91"></span><span class="anchor" id="line-92"></span><span class="anchor" id="line-93"></span><span class="anchor" id="line-94"></span><span class="anchor" id="line-95"></span><span class="anchor" id="line-96"></span><span class="anchor" id="line-97"></span><span class="anchor" id="line-98"></span><span class="anchor" id="line-99"></span><span class="anchor" id="line-100"></span><span class="anchor" id="line-101"></span><span class="anchor" id="line-102"></span><span class="anchor" id="line-103"></span><span class="anchor" id="line-104"></span><span class="anchor" id="line-105"></span><span class="anchor" id="line-106"></span><span class="anchor" id="line-1-9"></span><div class="highlight python3"><div class="codearea" dir="ltr" lang="en">
<script type="text/javascript">
document.write('<a href="#" onclick="return togglenumber(\'CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9\', 1, 1);" \
class="codenumbers">Toggle line numbers<\/a>');
</script>
<pre dir="ltr" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9" lang="en"><span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_1"></span><span class="anchor" id="line-1-10"></span><span class="Comment">#!/usr/bin/env python3</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_2"></span><span class="anchor" id="line-2-2"></span><span class="ResWord">from</span> <span class="ID">lxml</span> <span class="ResWord">import</span> <span class="ID">html</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_3"></span><span class="anchor" id="line-3-2"></span><span class="ResWord">from</span> <span class="ID">io</span> <span class="ResWord">import</span> <span class="ID">BytesIO</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_4"></span><span class="anchor" id="line-4-2"></span><span class="ResWord">import</span> <span class="ID">requests</span> <span class="ResWord">as</span> <span class="ID">reqs</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_5"></span><span class="anchor" id="line-5-2"></span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_6"></span><span class="anchor" id="line-6-2"></span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_7"></span><span class="anchor" id="line-7-2"></span><span class="Comment"># Firefox XPath plugin, handy for finding the expression used below:</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_8"></span><span class="anchor" id="line-8-2"></span><span class="Comment"># https://addons.mozilla.org/en-US/firefox/addon/xpath-checker/</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_9"></span><span class="anchor" id="line-9-2"></span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_10"></span><span class="anchor" id="line-10-2"></span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_11"></span><span class="anchor" id="line-11-2"></span><span class="ID">url</span> = <span class="String">"</span><span class="String">http://macaomilano.org/spip.php?rubrique18</span><span class="String">"</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_12"></span><span class="anchor" id="line-12-2"></span><span class="ID">r</span> = <span class="ID">reqs</span>.<span class="ID">get</span>(<span class="ID">url</span>)</span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_13"></span><span class="anchor" id="line-13-2"></span><span class="ID">doc</span> = <span class="ID">html</span>.<span class="ID">parse</span>(<span class="ID">BytesIO</span>(<span class="ID">r</span>.<span class="ID">content</span>))  <span class="Comment"># parse the raw bytes into an element tree</span></span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_14"></span><span class="anchor" id="line-14-2"></span><span class="ID">titles</span> = <span class="ID">doc</span>.<span class="ID">xpath</span>(<span class="String">"</span><span class="String">id(</span><span class="String">'</span><span class="String">container</span><span class="String">'</span><span class="String">)/div/section/header/h2/a/span/text()</span><span class="String">"</span>)</span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_15"></span><span class="anchor" id="line-15-1"></span><span class="ResWord">for</span> <span class="ID">t</span> <span class="ResWord">in</span> <span class="ID">titles</span>:</span>
<span class="line"><span class="LineAnchor" id="CA-23af50fad8100d6e8fbe6a70bdb68ba01bf3c8f9_16"></span><span class="anchor" id="line-16-1"></span>    <span class="ResWord">print</span>(<span class="ID">t</span>)</span>
</pre></div></div><span class="anchor" id="line-107"></span><span class="anchor" id="line-108"></span><p class="line867">
<h3 id="Fake_user-agent">Fake user-agent</h3>
<span class="anchor" id="line-109"></span><p class="line867"><span class="anchor" id="line-110"></span><span class="anchor" id="line-111"></span><span class="anchor" id="line-112"></span><span class="anchor" id="line-113"></span><span class="anchor" id="line-114"></span><span class="anchor" id="line-1-11"></span><div class="highlight python3"><div class="codearea" dir="ltr" lang="en">
<script type="text/javascript">
document.write('<a href="#" onclick="return togglenumber(\'CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4\', 1, 1);" \
class="codenumbers">Toggle line numbers<\/a>');
</script>
<pre dir="ltr" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4" lang="en"><span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_1"></span><span class="ResWord">import</span> <span class="ID">requests</span></span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_2"></span></span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_3"></span><span class="ID">url</span> = <span class="String">"</span><span class="String">http://www.macaomilano.org/spip.php?rubrique18</span><span class="String">"</span></span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_4"></span><span class="anchor" id="line-1-12"></span><span class="ID">headers</span> = <span class="ID">requests</span>.<span class="ID">utils</span>.<span class="ID">default_headers</span>()</span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_5"></span><span class="anchor" id="line-2-3"></span><span class="ID">headers</span>.<span class="ID">update</span>({<span class="String">"</span><span class="String">User-Agent</span><span class="String">"</span>: <span class="String">"</span><span class="String">Mozilla/5.0</span><span class="String">"</span>})  <span class="Comment"># pose as a browser</span></span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_6"></span><span class="anchor" id="line-3-3"></span></span>
<span class="line"><span class="LineAnchor" id="CA-b6788d7319e0d92c151d0f8cbb2a4da4ad1607b4_7"></span><span class="anchor" id="line-4-3"></span><span class="ID">r</span> = <span class="ID">requests</span>.<span class="ID">get</span>(<span class="ID">url</span>, <span class="ID">headers</span>=<span class="ID">headers</span>)</span>
</pre></div></div><span class="anchor" id="line-115"></span><span class="anchor" id="line-116"></span><p class="line867">
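<p class="line874">The same header override can be sketched with the standard library alone, for cases where requests is not installed. This swaps requests for <em>urllib.request</em>, and the function names are hypothetical. Without the override, urllib announces itself as Python-urllib followed by its version, which some sites reject.</p>
<pre>
```python
#!/usr/bin/env python3
from urllib.request import Request, urlopen


def browser_like_request(url, user_agent="Mozilla/5.0"):
    """Build a GET request that carries a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_as_browser(url):
    """Send the request and return the raw response body as bytes."""
    with urlopen(browser_like_request(url)) as response:
        return response.read()
```
</pre>
<p class="line874">Building the request separately from sending it makes it easy to inspect the headers before anything goes over the network.</p>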
<h3 id="Getting_nasty">Getting nasty</h3>
<span class="anchor" id="line-117"></span><p class="line874">Routing through Tor: <span class="anchor" id="line-118"></span><span class="anchor" id="line-119"></span><span class="anchor" id="line-120"></span><span class="anchor" id="line-121"></span><span class="anchor" id="line-122"></span><span class="anchor" id="line-123"></span><span class="anchor" id="line-124"></span><span class="anchor" id="line-125"></span><span class="anchor" id="line-126"></span><span class="anchor" id="line-127"></span><span class="anchor" id="line-128"></span><span class="anchor" id="line-129"></span><span class="anchor" id="line-130"></span><span class="anchor" id="line-131"></span><span class="anchor" id="line-132"></span><span class="anchor" id="line-133"></span><span class="anchor" id="line-134"></span><span class="anchor" id="line-1-13"></span><div class="highlight python3"><div class="codearea" dir="ltr" lang="en">
<script type="text/javascript">
document.write('<a href="#" onclick="return togglenumber(\'CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb\', 1, 1);" \
class="codenumbers">Toggle line numbers<\/a>');
</script>
<pre dir="ltr" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb" lang="en"><span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_1"></span><span class="anchor" id="line-1-14"></span><span class="ResWord">import</span> <span class="ID">socks</span>  <span class="Comment"># provided by the PySocks package</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_2"></span><span class="anchor" id="line-2-4"></span><span class="ResWord">import</span> <span class="ID">socket</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_3"></span><span class="anchor" id="line-3-4"></span><span class="ResWord">import</span> <span class="ID">requests</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_4"></span><span class="anchor" id="line-4-4"></span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_5"></span><span class="anchor" id="line-5-3"></span><span class="Comment"># Before: prints our real IP address</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_6"></span><span class="anchor" id="line-6-3"></span><span class="ResWord">print</span>(<span class="ID">requests</span>.<span class="ID">get</span>(<span class="String">"</span><span class="String">http://icanhazip.com</span><span class="String">"</span>).<span class="ID">text</span>)</span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_7"></span><span class="anchor" id="line-7-3"></span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_8"></span><span class="anchor" id="line-8-3"></span><span class="ID">socks</span>.<span class="ID">set_default_proxy</span>(<span class="ID">proxy_type</span>=<span class="ID">socks</span>.<span class="ID">PROXY_TYPE_SOCKS5</span>,</span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_9"></span><span class="anchor" id="line-9-3"></span>                        <span class="ID">addr</span>=<span class="String">"</span><span class="String">127.0.0.1</span><span class="String">"</span>,</span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_10"></span><span class="anchor" id="line-10-3"></span>                        <span class="ID">port</span>=<span class="Number">9050</span>)  <span class="Comment"># default SOCKS port of a local Tor client</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_11"></span><span class="anchor" id="line-11-3"></span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_12"></span><span class="anchor" id="line-12-3"></span><span class="ID">socket</span>.<span class="ID">socket</span> = <span class="ID">socks</span>.<span class="ID">socksocket</span>  <span class="Comment"># every new socket now goes through the proxy</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_13"></span><span class="anchor" id="line-13-3"></span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_14"></span><span class="anchor" id="line-14-3"></span><span class="Comment"># After: prints the address of a Tor exit node</span></span>
<span class="line"><span class="LineAnchor" id="CA-1ec9fd47065049ca91a17443dce5f33537a9f7fb_15"></span><span class="anchor" id="line-15-2"></span><span class="ResWord">print</span>(<span class="ID">requests</span>.<span class="ID">get</span>(<span class="String">"</span><span class="String">http://icanhazip.com</span><span class="String">"</span>).<span class="ID">text</span>)</span>
</pre></div></div><span class="anchor" id="line-135"></span><span class="anchor" id="bottom"></span></div>
</div>
<hr>
Last change: 18-03-2017
</body>
</html>