<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>SASBI.net</title>
	<atom:link href="http://sasbi.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://sasbi.net</link>
	<description>SAS Business Intelligence Secrets</description>
	<pubDate>Sat, 27 Mar 2010 07:39:17 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Oracle comments</title>
		<link>http://sasbi.net/oracle-comments/</link>
		<comments>http://sasbi.net/oracle-comments/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 07:39:17 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=192</guid>
		<description><![CDATA[There are two things that annoy me when copying data from other DBs into SAS: neither indexes nor field labels/comments are copied. I don’t know what to do about indexes, but there is a workaround for labels/comments.
The problem occurred after creation of the decision tree schema for presentation. There were no labels in the [...]]]></description>
			<content:encoded><![CDATA[<p>There are two things that annoy me when copying data from other DBs into SAS: neither indexes nor field labels/comments are copied. I don’t know what to do about indexes, but there is a workaround for labels/comments.</p>
<p>The problem surfaced after the decision tree schema was created for a presentation. The table copied from Oracle had no labels, so the schema showed cryptic variable names instead of clear descriptions. The columns did have comments in Oracle, but they were not copied with the data.</p>
<p>We decided to copy comments into a separate table and use a data step to set labels in the table with data.</p>
<p><span id="more-192"></span></p>
<p>Oracle (like SAS) has system tables that contain information about all Oracle tables. For instance, column comments are available in the SYS.ALL_COL_COMMENTS table. Knowing the schema name (similar to a SAS library) and the table name, one can get the comments using the pass-through facility:</p>
<pre class="brush: text">
proc sql;
  connect to odbc as Oracle(datasrc=&quot;Oracle&quot; user=&quot;***&quot; password=&quot;***&quot;);

  create table work.labels as
  select *
  from connection to Oracle(
    select *
    from sys.all_col_comments t
    where t.owner = &#039;&lt;schema name&gt;&#039;
    and t.table_name = &#039;&lt;table name&gt;&#039;
  );

  disconnect from Oracle;
quit;</pre>
<p>The WORK.LABELS table contains the following columns:</p>
<ul>
<li>OWNER – owner (creator) of the Oracle schema,</li>
<li>TABLE_NAME,</li>
<li>COLUMN_NAME,</li>
<li>COMMENTS.</li>
</ul>
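<p>SAS exposes similar metadata about its own tables through DICTIONARY.COLUMNS. As a minimal sketch, assuming the data copied from Oracle sits in WORK._TEMP (the table name is an assumption), the following query lists the variables that currently have no label:</p>
<pre class="brush: text">
proc sql;
  /* libname and memname are stored in upper case in the dictionary tables */
  select name, label
  from dictionary.columns
  where libname = &#039;WORK&#039;
    and memname = &#039;_TEMP&#039;
    and label = &#039; &#039;;
quit;</pre>
<p>Running the same query after the labels are applied is a quick way to check that nothing was missed.</p>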
<p>One can use the label statement in a data step to turn the comments into labels:</p>
<pre class="brush: text">
label &lt;column name&gt; = &#039;&lt;comments&gt;&#039;;</pre>
<p>We decided to automate this process with a data step that reads the WORK.LABELS table and writes the file c:\update_labels.sas containing all the label statements.</p>
<pre class="brush: text">
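data _null_;
  file &quot;c:\update_labels.sas&quot;;
  set work.labels end = last;  /* comments downloaded from Oracle */

  /* Remove single quotes so that they cannot break the label literal */
  COMMENTS = compress(COMMENTS, &quot;&#039;&quot;);

  /* Header of the generated step; _temp is the table with the data */
  if _N_ = 1 then do;
    put &quot;data work.data_with_labels;&quot;;
    put &quot;  set _temp;&quot;;
  end;

  /* One label statement per Oracle column */
  put &quot;  label &quot; column_name &quot; = &#039;&quot; COMMENTS &quot;&#039;;&quot;;

  if last then put &quot;run;&quot;;
run;</pre>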
<p>Next we run the code:</p>
<pre class="brush: text">
%INCLUDE &quot;c:\update_labels.sas&quot;;</pre>
<p>And delete the file:</p>
<pre class="brush: text">
options noxwait;
x &#039;del c:\update_labels.sas&#039;;
</pre>
<p>As a result, the WORK.DATA_WITH_LABELS table has the labels that we downloaded from Oracle.</p>
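<p>The temporary file can also be avoided. The following is an alternative sketch, not the approach used above: CALL EXECUTE queues the generated statements directly, and PROC DATASETS changes the labels of WORK._TEMP in place instead of creating a copy. It assumes that WORK.LABELS and WORK._TEMP exist as described above and that the Oracle column names are valid SAS variable names. Comments are cut to 256 characters, the maximum length of a SAS label.</p>
<pre class="brush: text">
data _null_;
  set work.labels end = last;
  where not missing(comments);  /* skip columns without a comment */
  length stmt $400;

  /* Open PROC DATASETS once, before the first label statement */
  if _n_ = 1 then
    call execute(&#039;proc datasets library=work nolist; modify _temp;&#039;);

  /* Same quoting approach as above: drop single quotes, wrap in single quotes */
  stmt = &#039;label &#039; || strip(column_name) || &quot; = &#039;&quot;
         || strip(substrn(compress(comments, &quot;&#039;&quot;), 1, 256)) || &quot;&#039;;&quot;;
  call execute(stmt);

  if last then call execute(&#039;quit;&#039;);
run;</pre>
<p>The queued statements run after the data step finishes. Because only the table header is modified, this variant may be preferable for large tables.</p>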
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/oracle-comments/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Presenting decision tree</title>
		<link>http://sasbi.net/presenting-decision-tree/</link>
		<comments>http://sasbi.net/presenting-decision-tree/#comments</comments>
		<pubDate>Sun, 14 Mar 2010 08:09:03 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Enterprise Miner]]></category>

		<category><![CDATA[Presenting results]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=185</guid>
		<description><![CDATA[Presenting data analysis results to business people is one of the hardest tasks. One has to avoid technical terms and explain math concepts in simple and clear language. Recently I faced this problem again when I had to present a decision tree model to my director.


The best idea I came up with was to print the decision [...]]]></description>
			<content:encoded><![CDATA[<p>Presenting data analysis results to business people is one of the hardest tasks. One has to avoid technical terms and explain math concepts in simple and clear language. Recently I faced this problem again when I had to present a decision tree model to my director.
</p>
<p>
The best idea I came up with was to print the decision tree on A2 size paper. The idea worked well: the print attracted a lot of attention and provoked a long discussion. This article covers tricks that improve the decision tree schema created by the SAS Enterprise Miner Decision Tree node.</p>
<p><span id="more-185"></span></p>
<h3>Printing</h3>
<p>One can print the decision tree from the Results window, but SAS automatically scales the image to fit on one page. If you don’t have a large format printer available, all the legends on your print will be too small to read.</p>
<p>I didn’t have a large format printer either. I copied the decision tree image (open the Results window and go to Edit &rarr; Copy), pasted it into the Paint editor and saved it as a file. Then I inserted the file into Excel, which can print at 1:1 scale and spread the image across multiple pages.</p>
<p>But even Excel leaves empty margins on each sheet. Having printed the image, I cut off the margins with a paper knife and a metal ruler.</p>
<h3>Title and footnote</h3>
<p>You can add a title and a footnote either in Paint or in SAS. In SAS, open the Results window, right-click the tree image and choose Graph properties &rarr; Title/Footnote. For instance, you can add the following:</p>
<ul>
<li>Title: Application Scoring v1.0</li>
<li>Subtitle: on clients’ data collected January 1 - June 1, 2009.</li>
<li>Footnote: “1” – client responded in less than 3 months, “0” – did not.</li>
</ul>
<p>I also recommend changing the background color to white in the Graph menu. That will save some printer ink and make your image brighter.</p>
<h3>Labels</h3>
<p>If your input data set doesn’t have labels, you will see variable names on the decision tree. They are often uninformative. If labels are present, the names are replaced with the labels automatically. You can add or replace labels in the following way:</p>
<ul>
<li>Add Transform Variables node to the diagram.</li>
<li>Place it right after the Data Source node.</li>
<li>Click the SAS Code button on the left and type the following code in the window:</li>
</ul>
<pre class="brush: text">
label &lt;variable name&gt; = &quot;&lt;description&gt;&quot;;
</pre>
<p>Label is the Base SAS statement that adds descriptions (labels) to data set variables. After running the entire diagram path you will see the variable labels on the decision tree image.</p>
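<p>A single label statement can describe several variables at once. The sketch below uses hypothetical variable names and descriptions; substitute the ones from your own input data set:</p>
<pre class="brush: text">
/* Variable names and descriptions are illustrative assumptions */
label
  account_start_date     = &quot;Account start date&quot;
  delinquency_start_date = &quot;Delinquency start date&quot;
  days_from_ASD_to_DSD   = &quot;Days from account start to delinquency&quot;;
</pre>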
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/presenting-decision-tree/feed/</wfw:commentRss>
		</item>
		<item>
		<title>SAS/Macro. Data step</title>
		<link>http://sasbi.net/sasmacro-data-step/</link>
		<comments>http://sasbi.net/sasmacro-data-step/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 16:38:27 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=178</guid>
		<description><![CDATA[SAS/Macro can reduce your data step code and make it easy to maintain.]]></description>
			<content:encoded><![CDATA[<p>Recently I worked on a data step that creates additional variables in a table downloaded from a data warehouse. Those variables were created for data analysis purposes. For instance, the calculated account age at delinquency can be more important than the account start date or the date of delinquency.</p>
<p><span id="more-178"></span></p>
<p>Besides the account age at delinquency, there were many similar variables, such as the number of days:</p>
<ul>
<li>from delinquency to the first call to the debtor,</li>
<li>from the first call to the first promise to pay the debt,</li>
<li>from the first promise to pay to the first payment,</li>
<li>etc.</li>
</ul>
<p>I wrote the following code to calculate the first variable:</p>
<pre class="brush: text">
if not( missing(account_start_date) or missing(delinquency_start_date) )
then days_from_ASD_to_DSD = intck(&#039;day&#039;,account_start_date,delinquency_start_date );
else call missing(days_from_ASD_to_DSD);
</pre>
<p>Then I started copying the code to calculate the other variables. After the third iteration I realized that I was doing something wrong. SAS/Macro allows creating macros and calling them in a data step. For instance, the code above can be rewritten as a macro:</p>
<pre class="brush: text">
%macro get_days_number(days_number, first_date, last_date);
  if not( missing(&amp;first_date) or missing(&amp;last_date) )
  then &amp;days_number = intck(&#039;day&#039;, &amp;first_date, &amp;last_date);
  else call missing(&amp;days_number);
%mend;
</pre>
<p>One has to run the macro definition before the data step. SAS then compiles the macro and saves it in the work library, which makes it available in subsequent data steps. After that you can call the macro in a data step in the following way:</p>
<pre class="brush: text">
%get_days_number(days_from_ASD_to_DSD, account_start_date, delinquency_start_date);
</pre>
<p>This macro helped me cut the code to about a third of its original size. Moreover, if the algorithm changes I will have to fix only the macro.</p>
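<p>To show the pieces working together, here is a minimal self-contained sketch. The macro is the one defined above; the WORK.ACCOUNTS table, its two sample rows and the first_call_date variable are assumptions made for illustration.</p>
<pre class="brush: text">
%macro get_days_number(days_number, first_date, last_date);
  if not( missing(&amp;first_date) or missing(&amp;last_date) )
  then &amp;days_number = intck(&#039;day&#039;, &amp;first_date, &amp;last_date);
  else call missing(&amp;days_number);
%mend get_days_number;

/* Hypothetical sample data; the second account was never called */
data work.accounts;
  input account_start_date :date9. delinquency_start_date :date9. first_call_date :date9.;
  format account_start_date delinquency_start_date first_call_date date9.;
  datalines;
01JAN2009 15MAR2009 20MAR2009
10FEB2009 01JUN2009 .
;
run;

/* Each macro call expands into one if/else statement */
data work.accounts_ext;
  set work.accounts;
  %get_days_number(days_from_ASD_to_DSD, account_start_date, delinquency_start_date)
  %get_days_number(days_from_DSD_to_call, delinquency_start_date, first_call_date)
run;
</pre>
<p>For the first row the new variables are 73 and 5. For the second row days_from_ASD_to_DSD is 111 and days_from_DSD_to_call is missing, because there is no first call date.</p>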
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/sasmacro-data-step/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Maximum levels of 512</title>
		<link>http://sasbi.net/maximum-levels-of-512/</link>
		<comments>http://sasbi.net/maximum-levels-of-512/#comments</comments>
		<pubDate>Sun, 28 Feb 2010 14:35:45 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[Enterprise Miner]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=170</guid>
		<description><![CDATA[Working with Enterprise Miner, I often encounter the error “Maximum levels of 512 exceeded”. It occurs after adding a new table to the project, during the execution of one of the model nodes such as decision tree or logistic regression. The reason is that one of the nominal variables has more than 512 different values.

SAS offers [...]]]></description>
			<content:encoded><![CDATA[<p>Working with Enterprise Miner, I often encounter the error “Maximum levels of 512 exceeded”. It occurs after adding a new table to the project, during the execution of one of the model nodes such as decision tree or logistic regression. The reason is that one of the nominal variables has more than 512 different values.</p>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/maximum_levels_of_512_exceeded.png" alt="maximum levels of 512 exceeded" title="maximum levels of 512 exceeded" class="alignnone size-full wp-image-172" /></p>
<p><a href="http://support.sas.com/kb/20/054.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/support.sas.com');">SAS offers two workarounds:</a></p>
<p><span id="more-170"></span></p>
<ol>
<li>Put %let EM_TRAIN_MAXLEVELS=n; in the project SAS code that executes before the project starts. The risk is that the offending variable is really an interval variable that was interpreted as nominal. In this case the results are unpredictable, ranging from the variable being ignored by the model (decision trees) to a failure to build a model at all (regression with a fixed number of variables).</li>
<li>Redefine the variable role as “interval” if it is really an interval variable, such as a date.</li>
</ol>
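<p>For instance, the first workaround could look like the line below. The value 1024 is an arbitrary assumption; choose a number above the largest level count among your genuinely nominal variables.</p>
<pre class="brush: text">
/* Raise the limit on nominal levels; 1024 is an illustrative value */
%let EM_TRAIN_MAXLEVELS = 1024;
</pre>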
<p>There is also a third workaround: one can reduce the number of different values by grouping them. But in order to use the last two pieces of advice one has to know which variable causes the error. To identify the variable, one can connect a StatExplore node to the input data node and run it. I recommend limiting the analysis to nominal variables only (set the role of every other variable to “rejected”).</p>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/stat_explore.png" alt="stat explore node" title="stat explore node" class="alignnone size-full wp-image-174" /></p>
<p>Among other statistics, StatExplore calculates the number of levels of each nominal variable. For instance, the ZIP variable from the cup98lrn data set contains 19,938 different values. One can exclude this variable from consideration or create another variable that contains only the first two or three digits of the ZIP code.</p>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/stat_explore_output.png" alt="stat explore output" title="stat explore output" class="alignnone size-full wp-image-175" /></p>
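<p>The same check and the ZIP grouping can also be sketched in Base SAS, outside Enterprise Miner. The code below assumes that the data set is available as WORK.CUP98LRN and that ZIP is a character variable of at least three characters; both are assumptions, so adjust them to your data.</p>
<pre class="brush: text">
/* NLEVELS reports the number of distinct values of each listed variable;
   NOPRINT suppresses the frequency tables themselves */
proc freq data=work.cup98lrn nlevels;
  tables _character_ / noprint;
run;

/* Replace ZIP with its first three characters to reduce the number of levels */
data work.cup98lrn_zip3;
  set work.cup98lrn;
  length zip3 $3;
  zip3 = substr(left(zip), 1, 3);
  drop zip;
run;
</pre>
<p>A three-digit prefix has at most 1,000 different values, so depending on the data it may still be necessary to keep only two digits or to raise the limit.</p>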
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/maximum-levels-of-512/feed/</wfw:commentRss>
		</item>
		<item>
		<title>ODBC driver. Excel client</title>
		<link>http://sasbi.net/odbc-driver-excel-client/</link>
		<comments>http://sasbi.net/odbc-driver-excel-client/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 13:06:29 +0000</pubDate>
		<dc:creator>Oleg Solovyev</dc:creator>
		
		<category><![CDATA[ODBC]]></category>

		<guid isPermaLink="false">http://sasbi.net/?p=153</guid>
		<description><![CDATA[


ODBC (Open DataBase Connectivity) is a technology to connect something (called client) to a database. It is widely used to download data to SAS from other DBs. But one can also use


ODBC to download data from SAS. For instance one can connect Excel to SAS to copy data from SAS into Excel skipping temporary CSV [...]]]></description>
			<content:encoded><![CDATA[<table>
<tr>
<td><img src="http://sasbi.net/wp-content/uploads/2010/02/sas_odbc_excel.png" alt="SAS ODBC connection schema" title="SAS ODBC connection schema" width="211" height="77" class="alignnone size-full wp-image-154" /></td>
<td>ODBC (Open DataBase Connectivity) is a technology to connect something (called client) to a database. It is widely used to download data to SAS from other DBs. But one can also use</td>
</tr>
</table>
<p style="margin-top : -3px;">ODBC to download data from SAS. For instance one can connect Excel to SAS to copy data from SAS into Excel skipping temporary CSV files or to use Excel as a DWH client.</p>
<p>There are <a href="http://support.sas.com/techsup/technote/ts626.html" onclick="javascript:pageTracker._trackPageview('/outbound/article/support.sas.com');">instructions for configuring the SAS ODBC driver</a> available on SAS.com. We will follow them and connect Excel to SAS.</p>
<p><span id="more-153"></span></p>
<h4>Driver installation</h4>
<p>First of all one has to download the ODBC driver. The driver is free but registration on SAS.com is required. To download the driver go to SAS.com &rarr; Support &#038; Training &rarr; Support &rarr; Downloads &#038; Hot Fixes &rarr; SAS Software &rarr; ODBC Drivers. After downloading sasodbc.exe, run it to install the driver. The installation is straightforward.</p>
<h4>Driver configuration</h4>
<p>After installing the driver, open the file C:\WINDOWS\system32\drivers\etc\services. The file can be in a different folder depending on the OS. Add the following line to the end of the file:</p>
<pre class="brush: text">
shr1        1234/tcp      # My SAS Server
</pre>
<p>This line defines a service named shr1 that listens for connections from the ODBC driver on TCP port 1234. The port number must not be used by any other service.</p>
<p>To configure the driver:</p>
<ol>
<li>Open the ODBC data source administrator window: Start &rarr; Control Panel &rarr; Administrative tools &rarr; Data Sources (ODBC).</li>
<li>On the User DSN tab click Add.</li>
<li>Choose “SAS” from the list and click Finish.</li>
<li>Data Source Name field: type “SAS ODBC”.</li>
<li>Servers tab: in the Name field type service name: shr1.</li>
<li>Click Configure, OK and Add.</li>
<li>Libraries tab: type any library name in the Name field. The library will be available to clients accessing SAS through the ODBC driver. The libraries defined in the autoexec.sas file will be unavailable.</li>
<li>Host File: type the path to the library folder.</li>
<li>Click ADD, OK, OK.</li>
</ol>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/odbc_admin.png" alt="ODBC Data Source Administrator" title="ODBC Data Source Administrator" width="490" height="402" class="alignnone size-full wp-image-163" /></p>
<h4>Connect Excel to SAS</h4>
<p>The instructions below are for Excel 2007 and differ slightly for Excel 2003.</p>
<ol>
<li>Run Excel.</li>
<li>Go to Data tab &rarr; From other Sources &rarr; From Microsoft Query.</li>
<li>Choose “SAS ODBC*” from the list and click OK.</li>
<li>A window with a list of SAS tables will appear after SAS starts.</li>
</ol>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/excel_query_wizard.png" alt="Query Wizard" title="Query Wizard" width="490" height="330" class="alignnone size-full wp-image-164" /></p>
<ol start=5>
<li>Move one of the tables from the left list to the right.</li>
<li>Click Next, Next, Next, Finish, Ok.</li>
<li>The data will appear in Excel.</li>
</ol>
<p>Microsoft Query Wizard also allows building SQL queries and sending them to SAS for execution. One has to choose two tables, and after clicking Next the Microsoft Query window will appear.</p>
<p><img src="http://sasbi.net/wp-content/uploads/2010/02/cup98lrn_excel.png" alt="cup98lrn_excel" title="cup98lrn_excel" width="490" height="245" class="alignnone size-full wp-image-165" /></p>
<h4>Add-in for MS Office</h4>
<p>There is a specialized SAS product that allows working with SAS through Excel: the Add-in for MS Office. Its advantage over the ODBC driver is that it provides SAS wizards for running math algorithms in SAS that are not yet available in Excel. But if you don’t need the math, the ODBC driver is the best choice because it is free.</p>
]]></content:encoded>
			<wfw:commentRss>http://sasbi.net/odbc-driver-excel-client/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
