<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>SASBI.net</title>
	<atom:link href="http://sasbi.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://sasbi.net</link>
	<description>STATS and Business Intelligence</description>
	<pubDate>Sun, 20 May 2012 15:09:39 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Learning Credit Scoring</title>
		<link>http://sasbi.net/learning-credit-scoring/</link>
		<comments>http://sasbi.net/learning-credit-scoring/#comments</comments>
		<pubDate>Sun, 20 May 2012 15:09:39 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Books]]></category>

		<category><![CDATA[Credit Scoring]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=335</guid>
		<description><![CDATA[I recommend four books for learning how to develop credit scorecards.]]></description>
			<content:encoded><![CDATA[<table border='0'>
<tr>
<td>A colleague from a friendly department wants to join our scoring department. He lacks knowledge of scorecard development, so I gave him my copy of <a href="http://www.amazon.com/Credit-Scoring-Risk-Managers-Handbook/dp/1450578969/ref=sr_1_1?ie=UTF8&#038;qid=1335863827&#038;sr=8-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');">Credit Scoring for Risk Managers</a> and recommended the <a href="http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29" onclick="javascript:pageTracker._trackPageview('/outbound/article/archive.ics.uci.edu');">German Credit</a> data set for practice. I also recommend <a href="http://www.amazon.com/Credit-Risk-Scorecards-Implementing-Intelligent/dp/047175451X/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335872855&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');">Credit Risk Scorecards</a>. Though the book is not as good as the first one, it is always interesting to compare two authors describing the same problem.</td>
<td></td>
<td>
<a href="http://www.amazon.com/Credit-Scoring-Risk-Managers-Handbook/dp/1450578969/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335872807&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');"><img src="http://www.sasbi.pro/wp-content/uploads/2012/05/credit-scoring-for-risk-managers.bmp" alt="credit scoring for risk managers" title="credit scoring for risk managers" width="92" height="140" /></a>
</td>
<td>
<a href="http://www.amazon.com/Credit-Risk-Scorecards-Implementing-Intelligent/dp/047175451X/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335872855&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');"><img src="http://www.sasbi.pro/wp-content/uploads/2012/05/credit-risk-scorecards1.jpg" alt="credit risk scorecards" title="credit risk scorecards " width="92" height="140" class="alignnone size-full wp-image-1970" /></a>
</td>
</tr>
</table>
<p><span id="more-335"></span></p>
<table>
<tr>
<td>
<a href="http://www.amazon.com/The-Credit-Scoring-Toolkit-Management/dp/0199226407/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335872905&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');"><img src="http://www.sasbi.pro/wp-content/uploads/2009/01/the-credit-scoring-toolkit.jpg" alt="the credit scoring toolkit" title="the credit scoring toolkit" width="106" height="140" class="alignnone size-full wp-image-1987" /></a>
</td>
<td>
<a href='http://www.amazon.com/Applied-logistic-regression-probability-statistics/dp/0471356328/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335872943&#038;sr=1-1'><img src="http://www.sasbi.pro/wp-content/uploads/2009/01/applied-logistic-regression.bmp" alt="applied logistic regression" title="applied logistic regression" width="88" height="140" class="alignnone size-full wp-image-1993" /></a>
</td>
<td>Later one may read through the <a href="http://www.amazon.com/The-Credit-Scoring-Toolkit-Management/dp/0199226407/ref=sr_1_3?ie=UTF8&#038;qid=1335866032&#038;sr=8-3" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');">Credit Scoring Toolkit</a> and work through <a href="http://www.amazon.com/Applied-logistic-regression-probability-statistics/dp/0471356328/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335866112&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');">Applied Logistic Regression</a>. Though it is written for scientists, it is a handbook for scorecard developers in two major Russian banks. To get used to SAS, <a href="http://www.amazon.com/The-Little-SAS-Book-Edition/dp/1599947250/ref=sr_1_1?s=books&#038;ie=UTF8&#038;qid=1335866177&#038;sr=1-1" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.amazon.com');">The Little SAS Book</a> is the best choice. Scoring in R is well described in the <a href="http://cran.r-project.org/doc/contrib/Sharma-CreditScoring.pdf" onclick="javascript:pageTracker._trackPageview('/outbound/article/cran.r-project.org');">Guide to Credit Scoring in R</a>.</td>
</tr>
</table>
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/learning-credit-scoring/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Exploring DWH</title>
		<link>http://sasbi.net/exploring-dwh/</link>
		<comments>http://sasbi.net/exploring-dwh/#comments</comments>
		<pubDate>Tue, 01 May 2012 15:42:58 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Data Warehouse]]></category>

		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=317</guid>
		<description><![CDATA[Usually it happens when changing jobs. One has to get used to the new DWH and start creating reports and ABTs as soon as possible. As a rule there is no documentation, so one has to find out what the key columns are and how to merge tables on one’s own. The good news is that programmers’ [...]]]></description>
			<content:encoded><![CDATA[<p>Usually it happens when changing jobs. One has to get used to the new DWH and start creating reports and ABTs as soon as possible. As a rule there is no documentation, so one has to find out what the key columns are and how to merge tables on one’s own. The good news is that programmers’ and administrators’ qualifications increase over time, and they have started developing DWH schemas using Visio, ERwin etc.</p>
<p><img src="http://sasbi.net/wp-content/uploads/2012/05/dwh-schema.png" alt="dwh-schema" title="dwh-schema" width="480" height="480" class="alignnone size-full wp-image-319" /><br />
<span id="more-317"></span></p>
<p>This article shows some simple queries that help in DWH exploration and a tool that visualizes key columns. The first query lists all tables:</p>
<pre class="brush: text">
/* SAS */
proc sql;
  select *
  from dictionary.tables;
quit;

/* Oracle */
select *
from all_tables;

/* MS SQL */
select *
from sys.tables;
</pre>
<p>The second query lists all columns:</p>
<pre class="brush: text">
/* SAS */
proc sql;
  select *
  from dictionary.columns;
quit;

/* Oracle */
select *
from all_tab_columns;

/* MS SQL */
select *
from information_schema.columns;
</pre>
<p>The third example is useful for debugging queries. To make them run faster, one can limit the number of rows a query reads or returns:</p>
<pre class="brush: text">
/* SAS: limit the rows read from the input table */
proc sql;
  select *
  from example(obs = 10);
quit;

/* SAS: limit the rows returned by the query */
proc sql outobs = 10;
  select *
  from example;
quit;

/* Oracle */
select *
from example
where rownum &lt;= 10;

/* MS SQL */
select top 10 *
from example;
</pre>
<p>With the list of columns from the second query, one can find columns that share the same name. Key columns in a “good” DWH usually have the same name in every table. I used R to make a graph of table links based on column names. Files with data and code can be found in the “<a href="http://sasbi.net/download/" >Download</a>” section.</p>
<p>You will need to install the Rgraphviz library as well as the Windows Graphviz libraries from <a href="http://www.graphviz.org/pub/graphviz/stable/windows/graphviz-2.20.3.1.msi" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.graphviz.org');">graphviz-2.20.3.1.msi</a>. Also make sure that the Graphviz bin directory is in your system PATH variable.</p>
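<p>The column-name matching described above can be sketched in a few lines. This is not the R code from the Download section, only a minimal Python illustration. It assumes the column list has already been exported as (table, column) pairs, and the table and column names are hypothetical.</p>

```python
from collections import defaultdict
from itertools import combinations


def table_links(columns):
    """Return {(table_a, table_b): [shared column names]} for linked tables."""
    tables_by_column = defaultdict(set)
    for table, column in columns:
        # Assumption: names are compared case-insensitively.
        tables_by_column[column.lower()].add(table)

    links = defaultdict(list)
    for column, tables in sorted(tables_by_column.items()):
        # Every pair of tables that shares this column name gets a link.
        for pair in combinations(sorted(tables), 2):
            links[pair].append(column)
    return dict(links)


# Hypothetical (table, column) rows, as the second query might return them.
sample_columns = [
    ("clients", "client_id"),
    ("clients", "name"),
    ("loans", "loan_id"),
    ("loans", "client_id"),
    ("payments", "loan_id"),
    ("payments", "amount"),
]

for (left, right), shared in sorted(table_links(sample_columns).items()):
    print(left, "--", right, "via", ", ".join(shared))
```

<p>Each resulting pair is one edge of the table graph. Generic names such as “id” or “name” would link nearly every table, so in practice one would probably filter them out before drawing.</p>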
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/exploring-dwh/feed/</wfw:commentRss>
		</item>
		<item>
		<title>R-project. Plots</title>
		<link>http://sasbi.net/r-project-plots/</link>
		<comments>http://sasbi.net/r-project-plots/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 18:52:09 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[R-project]]></category>

		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=305</guid>
		<description><![CDATA[A good analyst knows how to present the results of an investigation. He makes good presentations with excellent plots and schemas. Unfortunately SAS plots are not that good, so some analysts make plots in Excel. But Excel is not as good as R, which allows modification of any element of the plot.

Recently I practiced making plots in [...]]]></description>
			<content:encoded><![CDATA[<p>A good analyst knows how to present the results of an investigation. He makes good presentations with excellent plots and schemas. Unfortunately SAS plots are not that good, so some analysts make plots in Excel. But Excel is not as good as R, which allows modification of any element of the plot.</p>
<p><iframe width="490" height="350" src="http://www.youtube.com/embed/CpxXCU0Ocwc" frameborder="0" allowfullscreen></iframe><br />
<br />
<span id="more-305"></span></p>
<p>Recently I practiced making plots in R using data on a <a href="http://www.sql.ru/forum/actualthread.aspx?tid=928116" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.sql.ru');">weight loss competition</a> among members of the SQL.ru forum, and made a good tutorial out of it. Files with data and code are available in the <a href="http://sasbi.net/download/" >download</a> section. The code has detailed comments and the commands are intuitive.</p>
<p>
<img src="http://sasbi.net/wp-content/uploads/2012/04/weight.jpg" alt="weight" title="weight" width="792" height="377" class="alignnone size-full wp-image-306" /></p>
<p>You will need the forecast package, which can be installed with the install.packages('forecast') command, or downloaded from R-project.org and saved in the C:\Program Files\R\R-2.12.2\library\ directory. Help is available through the “?” symbol, for example “?plot”. I’ll give you a bottle of wine if you make the same plot using any other software.</p>
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/r-project-plots/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Social Network Analysis</title>
		<link>http://sasbi.net/social-network-analysis/</link>
		<comments>http://sasbi.net/social-network-analysis/#comments</comments>
		<pubDate>Sun, 14 Aug 2011 15:54:21 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Data Mining]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=296</guid>
		<description><![CDATA[Social Network Analysis uses graphs to understand relationships between people.]]></description>
			<content:encoded><![CDATA[<p>One of the newest fields in data mining is Social Network Analysis (SNA). The task is to find your friends (first circle), then friends of your friends (second circle), and so on. Mathematicians call this building a graph made of nodes (the people) and edges (ties between people).</p>
<p><iframe title="YouTube video player" width="489" height="390" src="http://www.youtube.com/embed/oLto_eY03rg" frameborder="0" allowfullscreen></iframe></p>
<p>For example, in telecom, graphs can be built from phone call data. The people you call are your first circle: relatives, colleagues or friends. You value those people and listen to their opinions. If one of your friends uses mobile internet, the telecom operator can offer this service to you with a high probability of purchase.</p>
<p><span id="more-296"></span></p>
<p>Social networks like Facebook can find out your first circle using your “friends list” or by monitoring the personal pages you visit. The advertising you saw on Facebook could have been shown to you because one of your friends clicked on it earlier.</p>
<p>Social networks are also important in debt collection. Colleagues and friends can influence the debtor and make him pay the debt. This is why banks and collection agencies actively collect contact information of your friends, neighbors and colleagues. It sometimes happens that the debt is paid by friends or relatives, not the debtor.</p>
<p>For software companies, experts are the most valued asset. Every person in the company should have access to the experts’ knowledge. The social network graph can show whether an expert is actively helping colleagues or is isolated from others. This graph can use data on internal mail and phone conversations.</p>
<p>For example, the graph above is based on the internet forum <a href="http://www.sql.ru/forum/actualtopics.aspx?bid=26" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.sql.ru');">SQL.ru&nbsp;&rarr; OLAP and DWH</a>. The nodes are the forum members and the edges show whether a member took part in another member’s thread. Every edge has a weight equal to the number of member A’s posts in member B’s threads plus the number of member B’s posts in member A’s threads.</p>
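<p>The edge weight defined above can be computed with a simple counter. The snippet below is a minimal Python sketch, not the code used for the video. It assumes the forum data has been reduced to (poster, thread owner) pairs, one per post; the member names are hypothetical, and skipping posts in one’s own thread is an assumption as well.</p>

```python
from collections import Counter


def edge_weights(posts):
    """Count undirected edge weights from (poster, thread_owner) pairs."""
    weights = Counter()
    for poster, thread_owner in posts:
        if poster == thread_owner:
            continue  # assumption: posts in one's own thread create no edge
        # frozenset makes A-B and B-A the same edge, so both directions add up.
        weights[frozenset((poster, thread_owner))] += 1
    return weights


# Hypothetical data: each tuple is one post, (who posted, whose thread).
sample_posts = [
    ("anna", "boris"),
    ("anna", "boris"),
    ("boris", "anna"),
    ("clara", "anna"),
    ("anna", "anna"),
]

for edge, weight in edge_weights(sample_posts).items():
    print(sorted(edge), weight)
```

<p>Here anna posted twice in boris’s thread and boris once in anna’s, so their edge gets the weight 3.</p>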
<p>At first I made a list of all the 3 000+ forum members and added edges. The graph looked like a black spot on the monitor. I removed all the edges with weights less than 10 and deleted all the members left without edges. That is the last graph in the video. Then I continued to delete the edges with the minimal weight until there was only one edge left. That is the first graph in the video. Finally I put the graphs in the video in reverse order, from the smallest graph to the biggest one.</p>
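<p>The pruning procedure can be sketched as follows. This is a hedged Python illustration under simple assumptions, not the original code: edges are stored as a dictionary from a pair of hypothetical member names to a weight, and members without edges disappear automatically because only edges are kept.</p>

```python
def shrinking_graphs(weights, first_threshold):
    """Return successive edge dicts, from the largest graph to the smallest."""
    # First cut: drop every edge lighter than the threshold.
    edges = {e: w for e, w in weights.items() if w >= first_threshold}
    frames = [edges]
    while len(edges) > 1:
        lowest = min(edges.values())
        # Drop all edges that carry the current minimal weight.
        smaller = {e: w for e, w in edges.items() if w > lowest}
        if not smaller:
            break  # all remaining edges tie, so there is no smaller graph
        edges = smaller
        frames.append(edges)
    return frames


# Hypothetical weighted edges between forum members.
sample_weights = {
    ("a", "b"): 25,
    ("a", "c"): 12,
    ("b", "c"): 10,
    ("c", "d"): 4,
}

frames = shrinking_graphs(sample_weights, first_threshold=10)
# The video shows the frames in reverse order, smallest graph first.
for frame in reversed(frames):
    print(len(frame), "edges:", sorted(frame))
```

<p>If several edges share the highest weight, this sketch stops with more than one edge left; how ties were handled in the original is not stated.</p>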
<p>The video below is based on the forum <a href="http://www.sql.ru/forum/actualtopics.aspx?bid=16" onclick="javascript:pageTracker._trackPageview('/outbound/article/www.sql.ru');">SQL.ru&nbsp;&rarr; Просто треп</a> (just&nbsp;chat).</p>
<p><iframe title="YouTube video player" width="489" height="390" src="http://www.youtube.com/embed/hSKW0EoImks" frameborder="0" allowfullscreen></iframe></p>
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/social-network-analysis/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Banks and Credit bureaus</title>
		<link>http://sasbi.net/banks-and-credit-bureaus/</link>
		<comments>http://sasbi.net/banks-and-credit-bureaus/#comments</comments>
		<pubDate>Sun, 07 Aug 2011 08:16:49 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Text Mining]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=289</guid>
		<description><![CDATA[According to Russian legislation every bank has to report to one of the credit bureaus (CB). The reported credit histories (CH) contain information on the credit amount, monthly payments and other details. Any bank can request your credit information to assess your creditworthiness and decide whether to issue you another loan.

Consumers with good [...]]]></description>
			<content:encoded><![CDATA[<p>According to <a href="http://base.consultant.ru/cons/cgi/online.cgi?req=doc;base=LAW;n=70212" onclick="javascript:pageTracker._trackPageview('/outbound/article/base.consultant.ru');">Russian legislation</a> every bank has to report to one of the credit bureaus (CB). The reported credit histories (CH) contain information on the credit amount, monthly payments and other details. Any bank can request your credit information to assess your creditworthiness and decide whether to issue you another loan.</p>
<p><img src="http://sasbi.net/wp-content/uploads/2011/08/bank_cb.png" alt="banks and credit bureaus" title="banks and credit bureaus" width="480" height="633"/></p>
<p>Consumers with good credit histories can get a new loan with a lower interest rate. But one has to know which CB stores one’s credit history and which banks can request that history from the CB. If your credit history is poor, for example because you had delinquent loans, you had better look for a bank that doesn’t request your credit history.</p>
<p><span id="more-289"></span></p>
<p>The Russian Central Bank (RCB) <a href="http://ckki.www.cbr.ru/?m_ParsSelectorState=1" onclick="javascript:pageTracker._trackPageview('/outbound/article/ckki.www.cbr.ru');">web site</a> allows one to find the CBs where one’s histories are stored. According to the RCB web site there are 800+ credit organizations and 30+ credit bureaus in Russia. Most credit histories are stored in the five biggest CBs: Equifax, Experian-Interfax, NBKI, Infocredit and MBKI.</p>
<p>Banks don’t like to publish the list of CBs they work with, but some information is available online. My task was to develop the schema of cooperation between banks and credit bureaus. For instance, if one sentence contains both a bank name and a credit bureau name, it is very probable that they exchange information with each other. But I had to exclude all the sentences containing more than one bank name or more than one CB name, because such a sentence can simply be a list of some forum participants.</p>
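<p>The sentence rule above can be illustrated with a short Python sketch. It is a simplified example, not the program used to build the schema. The bank list and the sample text are hypothetical, names are matched as plain substrings, and sentences are split naively on punctuation.</p>

```python
import re
from collections import Counter

# Hypothetical name lists; the lists actually used are not given in the post.
BANKS = ["Alfa-Bank", "Sberbank", "VTB"]
BUREAUS = ["Equifax", "NBKI", "MBKI"]


def names_in(sentence, names):
    """Return the names that occur in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return [name for name in names if name.lower() in lowered]


def bank_bureau_pairs(text):
    """Count (bank, bureau) pairs from sentences naming exactly one of each."""
    pairs = Counter()
    # Naive sentence split: ., ! or ? followed by whitespace.
    for sentence in re.split(r"[.!?]\s+", text):
        banks = names_in(sentence, BANKS)
        bureaus = names_in(sentence, BUREAUS)
        # Sentences naming several banks or bureaus are skipped as likely lists.
        if len(banks) == 1 and len(bureaus) == 1:
            pairs[(banks[0], bureaus[0])] += 1
    return pairs


sample_text = (
    "Sberbank sends credit histories to Equifax. "
    "Members: Alfa-Bank, VTB, Sberbank, NBKI, MBKI. "
    "I heard that VTB works with NBKI! "
    "Sberbank checked my record at Equifax."
)
print(bank_bureau_pairs(sample_text))
```

<p>The counts can serve as edge weights in the schema: the more sentences mention a bank together with a bureau, the more likely the link is real.</p>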
<p>The schema above was developed using 5 000+ HTML pages that mention at least one of the five biggest CBs. Unfortunately the schema doesn’t show the direction of the information exchange: a bank can request information from one CB and report credit histories to another. Determining the direction of data exchange is the next task, as is increasing the number of banks and CBs covered.</p>
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/banks-and-credit-bureaus/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>

