A technology blog about SAS programming and data mining

Owner: Chao_Huang

Listed in: Technology

Language: English

Latest Blog Posts for SAS Analysis

  • Two alternative ways to query large dataset in SAS
    on Jun 19, 2015 in python SAS
    I really appreciate the wonderful comments readers have left on my SAS posts (1, 2, 3). They gave me a lot of inspiration. Because of the inherent limitations of SAS and SQL, I have recently found it difficult to deal with some extremely large SAS datasets (...
  • saslib: a simple Python tool to lookup SAS metadata
    on Jun 3, 2015 in python SAS
    saslib is an HTML report generator for looking up metadata (or header information), like PROC CONTENTS in SAS. It reads sas7bdat files directly and quickly, and does not need SAS installed. Emulate PROC CONTENTS by jQuery and Da...
  • Deploy a minimal Spark cluster
    on Mar 20, 2015 in python spark
    Requirements: Since Spark is rapidly evolving, I need to deploy and maintain a minimal Spark cluster for testing and prototyping. A public cloud is the best fit for my current needs. Intranet speed: The cluster should easily copy the data...
  • Solve the Top N questions in SAS/SQL
    on Feb 3, 2015 in SAS
    This is a follow-up to my previous post about SAS/SQL. SAS's SQL procedure has a basic SQL syntax. I found that the most challenging task is using PROC SQL to solve the TOP N (or TOP N by Group) questions. Compared with other modern datab...
  • Deploy a MongoDB powered Flask app in 5 minutes
    on Feb 1, 2015 in python
    This is a quick tutorial on deploying a web service (a social network) with the LNMP (Linux, Nginx, MongoDB, Python) infrastructure on any IaaS cloud. The repo is on GitHub. Stack: The stack is built on...
  • Spark practice(4): malicious web attack
    on Jan 8, 2015 in python spark
    Suppose a website tracks user activity to prevent robotic attacks on the Internet. Design an algorithm to identify user IDs that have more than 500 clicks within any given 10 minutes. Sample.txt: anonymousUserID timeStamp clickC...
  • Spark practice (3): clean and sort Social Security numbers
    on Dec 23, 2014 in python spark
    Sample.txt. Requirements: 1. Separate valid SSNs and invalid SSNs. 2. Count the number of valid SSNs. 402-94-7709 283-90-3049 124-01-2425 1231232088-57-9593 905-60-3585 44-82-8341257581087327-84-0220402-94-7709 Thoughts: SSN-indexed data is commonly seen and stor...
  • Spark practice (2): query text using SQL
    on Dec 12, 2014 in python spark
    In a class of a few children, use SQL to find those who are male and weigh over 100. class.txt (including Name Sex Age Height Weight): Alfred M 14 69.0 112.5 Alice F 13 56.5 84.0 Barbara F 13 65.3 98.0 Carol F 14 62.8 102.5 Henry M 14 63.5 102.5 James...
  • Spark practice (1): find the stranger that shares the most friends with me
    on Dec 7, 2014 in python spark
    Given the friend pairs in the sample text below (each line contains two people who are friends), find the stranger that shares the most friends with me. sample.txt: me Alice; Henry me; Henry Alice; me Jane; Alice John; Jane John; Judy Alice; me Mary; Mary Joyce; Joyce He...
  • Use a vector to print Pascal's triangle in SAS
    on Dec 5, 2014 in SAS
    Yesterday Rick Wicklin showed a cool SAS/IML function that uses a matrix to print Pascal's triangle. I came up with an alternative solution that uses a vector in SAS/IML. Method: Two functions are used, including a main function PascalRule and a helper...