Scripting

Question

In this assignment you will read through and parse a file with text and numbers. You will extract all the numbers in the file and compute the sum of the numbers.

import re

fh = open("test.txt", "r")
# sample text for quick testing: "sumit tokel 23 11 amit tokle 50"
text = fh.read()
nums = re.findall("[0-9]+", text)
nums = [int(i) for i in nums]
print sum(nums)

Regular expressions (a quick example follows this reference list)

^ Matches the beginning of a line
$ Matches the end of the line
. Matches any character
\s Matches whitespace
\S Matches any non-whitespace character
* Repeats a character zero or more times
*? Repeats a character zero or more times (non-greedy)
+ Repeats a character one or more times
+? Repeats a character one or more times (non-greedy)
[aeiou] Matches a single character in the listed set
[^XYZ] Matches a single character not in the listed set
[a-z0-9] The set of characters can include a range
( Indicates where string extraction is to start
) Indicates where string extraction is to end
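A quick example of a few of these patterns in action (the sample line is a typical mbox "From" line, like the ones used later in these notes):

import re

line = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'

# [0-9]+ matches one or more digits (greedy)
print re.findall('[0-9]+', line)        # ['5', '09', '14', '16', '2008']

# \S+@\S+ matches non-whitespace around an '@' - a rough email match
print re.findall('\S+@\S+', line)       # ['stephen.marquard@uct.ac.za']

# ^From (\S+) extracts only the part inside the parentheses
print re.findall('^From (\S+)', line)   # ['stephen.marquard@uct.ac.za']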

Sample code to create a socket

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # create the local endpoint
mysock.connect(('www.py4inf.com', 80))                       # initiate the connection
mysock.send('GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n')

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print data

mysock.close()

Question:

Read the file from the website http://www.py4inf.com/code/romeo.txt and count the frequency of each word in the file. Use the urllib package to open the file.

import urllib

fhand = urllib.urlopen('http://www.py4inf.com/code/romeo.txt')

counts = dict()
for line in fhand:
    words = line.split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print counts

# Note - this code must run in Python 2.x and you must download
# http://www.pythonlearn.com/code/BeautifulSoup.py
# into the same folder as this program

import urllib
from BeautifulSoup import *

url = raw_input("Enter - ")

html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)

# Retrieve a list of the anchor tags
# Each tag is like a dictionary of HTML attributes

tags = soup("a")

for tag in tags:
    print tag.get('href', None)

Question:
The program will use urllib to read the HTML from the data files below, parse the data, extract the numbers and compute their sum. You are to find all the span tags in the file, pull the number out of each tag and sum the numbers.

import urllib
from BeautifulSoup import *

url = raw_input("Enter - ")

html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)

total = 0
tags = soup("span")

for tag in tags:
    total = total + int(tag.contents[0])
print total

In this assignment you will write a Python program that expands on http://www.pythonlearn.com/code/urllinks.py. The program will use urllib to read the HTML from the data files below, extract the href= values from the anchor tags, scan for a tag that is at a particular position relative to the first name in the list, follow that link, repeat the process a number of times and report the last name you find.

Start at: http://python-data.dr-chuck.net/known_by_Harry.html
Find the link at position 18 (the first name is 1). Follow that link. Repeat this process 7 times. The answer is the last name that you retrieve.
Hint: The first character of the name of the last page that you will load is: M

import re
import urllib
from BeautifulSoup import *

url = raw_input('Enter - ')
html = urllib.urlopen(url).read()

count = int(raw_input("Enter count: "))
position = int(raw_input("Enter position: "))

print "Retrieving: %s" % url

while count > 0:
    soup = BeautifulSoup(html)
    tags = soup('a')
    name = tags[position - 1]
    print "Retrieving: %s" % name.get('href', None)
    url = name.get('href', None)
    html = urllib.urlopen(url).read()
    count = count - 1

new_url = name.get('href', None)
name = re.search('(.*)_(.*).html', new_url)
print name.group(2)


Extracting Data from XML:

In this assignment you will write a Python program somewhat similar to http://www.pythonlearn.com/code/geoxml.py. The program will prompt for a URL, read the XML data from that URL using urllib, then parse and extract the comment counts from the XML data and compute the sum of the numbers in the file.

We provide two files for this assignment. One is a sample file where we give you the sum for your testing and the other is the actual data you need to process for the assignment.

To make the code a little simpler, you can use an XPath selector string to look through the entire tree of XML for any tag named 'count' with the following line of code:

counts = tree.findall('.//count')

Take a look at the Python ElementTree documentation and look for the supported XPath syntax for details. You could also work from the top of the XML down to the comments node and then loop through the child nodes of the comments node.
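The second approach (walking from the top of the XML down to the comments node and looping over its children) would look roughly like the sketch below; the findall-based solution follows afterwards. The sample data here is made up, and it assumes the assignment file has the shape of a commentinfo element containing a comments element whose comment children each hold a count.

import xml.etree.ElementTree as ET

# Made-up sample data assumed to match the shape of the assignment file
data = '''<commentinfo>
  <comments>
    <comment><name>Matthias</name><count>97</count></comment>
    <comment><name>Romina</name><count>97</count></comment>
  </comments>
</commentinfo>'''

tree = ET.fromstring(data)
total = 0
# Walk down to the comments node and loop over its comment children
for comment in tree.find('comments').findall('comment'):
    total = total + int(comment.find('count').text)
print total   # 194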

import urllib
import xml.etree.ElementTree as ET

# serviceurl = 'http://maps.googleapis.com/maps/api/geocode/xml?'   # left over from geoxml.py, not used here

url = raw_input("Enter URL: ")
print "Retrieving", url
data = urllib.urlopen(url).read()
print 'Retrieved', len(data), 'characters'
stuff = ET.fromstring(data)

sums = []
freq = 0
counts = stuff.findall('.//count')
for count in counts:
    print count.text
    sums.append(int(count.text))
    freq = freq + 1
print "count: ", freq
print "sum: ", sum(sums)
------------------------------
Example for the Google Maps API
------------------------------
import urllib
import json

# serviceurl = 'http://maps.googleapis.com/maps/api/geocode/json?'
serviceurl = 'http://python-data.dr-chuck.net/geojson?'

while True:
    address = raw_input('Enter location: ')
    if len(address) < 1 : break

    url = serviceurl + urllib.urlencode({'sensor':'false', 'address': address})
    print 'Retrieving', url
    uh = urllib.urlopen(url)
    data = uh.read()
    print 'Retrieved',len(data),'characters'

    try: js = json.loads(str(data))
    except: js = None
    if not js or 'status' not in js or js['status'] != 'OK':
        print '==== Failure To Retrieve ===='
        print data
        continue

    print json.dumps(js, indent=4)

    lat = js["results"][0]["geometry"]["location"]["lat"]
    lng = js["results"][0]["geometry"]["location"]["lng"]
    print 'lat',lat,'lng',lng
    location = js['results'][0]['formatted_address']
    print location

JSON example:

In this assignment you will write a Python program somewhat similar to http://www.pythonlearn.com/code/json2.py. The program will prompt for a URL, read the JSON data from that URL using urllib, then parse and extract the comment counts from the JSON data, compute the sum of the numbers in the file and enter the sum below:

import urllib
import json


while True:
    url = raw_input('Enter location: ')
    if len(url) < 1 : break

    print 'Retrieving', url
    uh = urllib.urlopen(url)
    data = uh.read()
    print 'Retrieved',len(data),'characters'

    try: js = json.loads(str(data))
    except: js = None
    if not js: continue

    total = []
    count = 0
    for item in js["comments"]:
        num = item["count"]
        total.append(num)
        count = count + 1
    answer = sum(total)
    print "count: ", count
    print "sum: ", answer

This application will read the mailbox data (mbox.txt) and count up the number of email messages per organization (i.e. the domain name of the email address), using a database with the following schema to maintain the counts.

CREATE TABLE Counts (org TEXT, count INTEGER)

Input file: http://www.pythonlearn.com/code/mbox.txt

Link to similar example code: http://www.pythonlearn.com/code/emaildb.py

Download the SQLite database browser from http://sqlitebrowser.org/

import sqlite3
import re

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()
cur.execute(''' DROP TABLE IF EXISTS Counts''')

cur.execute(''' CREATE TABLE Counts (org TEXT, count INTEGER) ''')

fname = raw_input('Enter file name: ')
if (len(fname) < 1) : fname = 'mbox.txt'
fh = open(fname)
for line in fh:
    m = re.search('From:\s[a-zA-Z0-9_.+-]+@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)',line)
    if m:
        domain = m.group(1)
        cur.execute('SELECT count from Counts WHERE org = ?', (domain,))
        row = cur.fetchone()
        if row is None:
            cur.execute('''INSERT INTO Counts (org, count) VALUES (?, 1)''', (domain,))
        else:
            cur.execute('UPDATE Counts SET count=count+1 WHERE org = ?', (domain, ))
        conn.commit()

sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'

print
print "Counts:"
for row in cur.execute(sqlstr) :
    print str(row[0]), row[1]


cur.close()

Event Handler programming

simplegui.create_timer(interval_in_milliseconds, handler_function_name)

import simplegui
# Event handler
def tick():
    print "tick!"
# Register handler
timer = simplegui.create_timer(1000, tick)
# Start timer
timer.start()

Example 2:

import simplegui
message = "Welcome!"

# Handler for mouse click
def click():
    global message
    message = "Good job!"

# Handler to draw on canvas
def draw(canvas):
    canvas.draw_text(message, [50,112], 36, "Red")

# Create a frame and assign callbacks to event handlers
frame = simplegui.create_frame("Home", 300, 200)
frame.add_button("Click me", click)
frame.set_draw_handler(draw)

# Start the frame animation
frame.start()

A class is like a building blueprint: from it we can look up the building's information (length, color, height), or create another building with, say, a different color.

When we call a function, only that specific function runs; but when we create an instance of a class, the __init__ method is called and it sets up the object's attributes in memory.
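A tiny sketch of the idea (the Building class and its attribute names are made up for this note):

class Building(object):
    def __init__(self, color, height):
        # __init__ runs once per new instance and stores its attributes
        self.color = color
        self.height = height

# Two separate buildings made from the same blueprint
b1 = Building("red", 30)
b2 = Building("blue", 45)
print(b1.color)    # red
print(b2.color)    # blue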

Very good video: https://www.youtube.com/watch?v=d4RZ6fIqKOg

This program draws many squares with the turtle module:

import turtle

def draw_square(some_turtle):
    for i in range(1,5):
        some_turtle.forward(100)
        some_turtle.right(90)

def draw_art():
    window = turtle.Screen()
    window.bgcolor("red")
    #create the turtle brad - Draw a square
    brad = turtle.Turtle()
    brad.shape("turtle")
    brad.color("yellow")
    brad.speed(2)
    for i in range(1,37):
        draw_square(brad)
        brad.right(10)
    #Create the turtle Angie - Draw a circle
    #angie = turtle.Turtle()
    #angie.shape("arrow")
    #angie.color("blue")
    #angie.circle(100)
    window.exitonclick()
draw_art()

There are many Python libraries that do not come with the standard library, such as "twilio". You can find the most downloaded Python packages on the PyPI Ranking page.

This program checks whether there are any profanity words in a text file:

import urllib

def read_text():
    quotes = open("c:/movie_quotes.txt")
    contents_of_file = quotes.read()
    quotes.close()
    check_profanity(contents_of_file)

def check_profanity(text_to_check):
    connection = urllib.urlopen("http://www.wdyl.com/profanity?q=" + text_to_check)
    output = connection.read()
    if "true" in output:
        print("Profanity Alert!!!")
    elif "false" in output:
        print("This document has no curse words!")
    else:
        print("Could not scan the document properly")
    connection.close()

read_text()


Parent and child classes:

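A minimal sketch of a parent/child class pair (the class names and attributes are made up for illustration; the child inherits everything from the parent):

class Parent(object):
    def __init__(self, last_name):
        self.last_name = last_name

    def show_info(self):
        print("Last name: " + self.last_name)

class Child(Parent):
    # Child inherits __init__ and show_info from Parent
    pass

c = Child("Smith")
c.show_info()    # Last name: Smith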

Method overriding: if a method with the same name is defined in the child class, the child's version is the one the object uses.

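A minimal sketch of overriding, reusing made-up class names from the sketch above:

class Parent(object):
    def greet(self):
        print("Hello from Parent")

class Child(Parent):
    def greet(self):
        # Same method name as in Parent, so Child objects use this version
        print("Hello from Child")

Parent().greet()    # Hello from Parent
Child().greet()     # Hello from Child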

Let's say we want to monitor prices on the johnlewis.com website. We open the site, inspect the HTML and find that the price (for example, $332) sits in a span element with itemprop="price" and class "now-price". We will write a program that connects to the website, parses the HTML and pulls out that price, as below:

import requests
from bs4 import BeautifulSoup

response = requests.get("http://johnlewis.com/john-lewis-wade-office-chair-black/p447855")
content = response.content
soup = BeautifulSoup(content, "html.parser")
element = soup.find("span", {"itemprop": "price", "class": "now-price"})
print(element.text.strip())

MONGODB:

1) To start the MongoDB server, execute the command mongod
2) In another terminal, execute the command mongo to open the shell
3) show dbs : lists the databases
4) use fullstack : switches to (and creates) the fullstack database
5) show collections : lists the collections in the current database
6) db.students.insert({"name": "jose", "mark": 99}) : inserts a document into the students collection of the fullstack db
7) db.students.find({}) : finds everything in the collection
8) db.students.insert({"item": "chair", "price": 999, "age": 25})
9) db.students.remove({"item": "chair"}) : removes the matching documents from the collection

import pymongo
uri = "mongodb://127.0.0.1:27017"
client = pymongo.MongoClient(uri)
database =  client['fullstack']
collection = database['students']

students = collection.find({})

for student in students:
    print(student)
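The shell insert/remove commands above have pymongo equivalents as well; a minimal sketch, assuming pymongo 3.x (the database, collection and field names follow the shell session above):

import pymongo

client = pymongo.MongoClient("mongodb://127.0.0.1:27017")
collection = client['fullstack']['students']

# Equivalent of db.students.insert({...}) in the mongo shell
collection.insert_one({"name": "jose", "mark": 99})

# Equivalent of db.students.find({...}) for a single document
print(collection.find_one({"name": "jose"}))

# Equivalent of db.students.remove({...}) for a single matching document
collection.delete_one({"name": "jose"})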





----------database.py--------
Here we are not using "self" when defining the functions; instead we mark them with @staticmethod so they can be called on the class itself from outside, without creating an instance.

import pymongo

class Database(object):
    URI = "mongodb://127.0.0.1:27017"
    DATABASE = None

    @staticmethod
    def initialize():
        client = pymongo.MongoClient(Database.URI)
        Database.DATABASE = client['fullstack']

    @staticmethod
    def insert(collection, data):
        Database.DATABASE[collection].insert(data)

    @staticmethod
    def find(collection, query):
        return Database.DATABASE[collection].find(query)

    @staticmethod
    def find_one(collection, query):
        return Database.DATABASE[collection].find_one(query)

-----------------------post.py-----------------

from database import Database
import uuid
import datetime

class Post(object):
    def __init__(self, blog_id, title, content, author, date=None, id=None):
        self.blog_id = blog_id
        self.title = title
        self.content = content
        self.author = author
        # default arguments are evaluated only once, so compute "now" inside __init__
        self.created_date = datetime.datetime.utcnow() if date is None else date
        # uuid.uuid4().hex generates a random hex id (the 4 means a random UUID);
        # if no id is passed in we generate one, otherwise we use the id passed
        self.id = uuid.uuid4().hex if id is None else id

    def save_to_mongo(self):
        Database.insert(collection = 'posts', data=self.json())


    def json(self):
        return {
        'id' : self.id,
        'blog_id' : self.blog_id,
        'author': self.author,
        'content': self.content,
        'title': self.title,
        'created_date': self.created_date
        }

    @staticmethod
    def from_mongo(id):
        return Database.find_one(collection="posts", query = {'id': id})

    @staticmethod
    def from_blog(id):
        return [post for post in Database.find(collection="posts", query = {'blog_id': id})]


---------------app.py---------------------

from database import Database
from models.post import Post

Database.initialize()

post = Post(blog_id="123", title="Another great post", content="This is some sample content", author="Jose")

post.save_to_mongo()
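To read posts back out of MongoDB, the static methods defined in post.py could be used roughly like this (a sketch; the blog_id matches the one saved above):

from database import Database
from models.post import Post

Database.initialize()

# All documents saved for this blog_id (each is a dict produced by Post.json())
posts = Post.from_blog("123")
for post_json in posts:
    print(post_json['title'])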