This post was moved to my real blog: ArcSDE: Connection to the Geodatabase
This blog is a placeholder for my real blog notes (used to share my drafts).
Please visit Dll Shepherd.Net
Please don't reference any of these posts because they might be moved or deleted!
Sunday, December 18, 2011
MSD tools: Changing Connection and Publishing
This post was moved to my real blog: MSD tools: Changing Connection and Publishing
Wednesday, November 23, 2011
Clearing ArcGIS Server REST API Cache
Tuesday, November 15, 2011
ArcObjects: Workspace Provider
This post was moved to my real blog: ArcObjects: Workspace Provider
Wednesday, July 13, 2011
Wednesday, June 8, 2011
ESRI Silverlight API: Getting Started
This post was moved to my real blog: ESRI Silverlight API: Getting Started
Saturday, June 4, 2011
ArcObjects: Extending the Framework
This post was moved to my real blog: ArcObjects: Extending the Framework
ArcObjects: Getting Started
This post was moved to my real blog: ArcObjects: Getting Started
Geographic Database: ESRI File Formats
This post was moved to my real blog: Geographic Database: ESRI File Formats
Sunday, May 29, 2011
Silverlight: String localization Problems
This post was moved to my real blog: Silverlight: String localization Problems
Thursday, May 26, 2011
Configuring Silverlight Applications
Saturday, May 7, 2011
ArcObjects: Introduction
This post was moved to my real blog: ArcObjects: Introduction
Wednesday, April 27, 2011
Combining WMV Videos
I had some videos downloaded from the internet in WMV format which I wanted to combine into a single file. Windows Movie Maker did combine my files, but the end result was both larger and lower in resolution, so I tried downloading another app, “Easy Video Joiner”, and wasn’t impressed.
It was half a solution because:
- It’s free – which is always a plus (it used to be order based but the company just grants the registration for free here)
- It joins WMVs without reducing quality or increasing size! - a plus
- It has errors popping up from time to time – a minus
- It closes while joining the files without any error – a minus
- It hasn’t been updated since Jun 30, 2003, which considering all the errors probably means it has no support – a minus
- The app is very random in its responses: running it twice might cause an error once and succeed the next time, or it might just drop dead. So no one should delete the source files without checking the result first!
So I wanted to check how hard it would be to implement a joiner of my own, and just looking for managed code for doing that took too long. In the end I found the Microsoft Expression Encoder SDK, which is supposed to enable this.
//TODO: try it out
Keywords: Encoding video, wmv
Wednesday, April 20, 2011
HTTP Error 503. The service is unavailable.
I was trying to solve a Silverlight/WCF error when I got this error instead:
Someone suggested this solution:
In the IIS Manager\App Pools\Application’s AppPool –> Advanced Settings –> Load User Profile = false:
If it helps you please post a response and I will move the solution to my real blog.
You can still read my tortured experience (though in the end only a format solved it):
I was using IISreset, IISreset /stop, IISreset /start like crazy.
In IIS Manager I found that one of the App Pools was stopped, so I decided to start it:
For about 2 minutes I thought that had been a mistake, since the start just didn’t complete. And when it did complete, the App Pool was still stopped.
Restart didn’t solve the problem.
Looking at the event log I found one error:
Log Name: System
Source: Microsoft-Windows-WAS
Date: 03/02/2011 15:08:03
Event ID: 5002
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Description:
Application pool 'ASP.NET v4.0' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
And about 5 warnings:
Log Name: System
Source: Microsoft-Windows-WAS
Date: 03/02/2011 15:08:03
Event ID: 5022
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Description:
The Windows Process Activation Service failed to create a worker process for the application pool 'ASP.NET v4.0'. The data field contains the error number.
I knew I had made some changes in the web.config file, so I decided to undo those and try to start the App Pool again.
I found this error in the Event Viewer: Application Log (it started around the time I started having problems):
Log Name: Application
Source: Microsoft-Windows-User Profiles Service
Date: 03/02/2011 14:36:20
Event ID: 1500
Task Category: None
Level: Error
Keywords:
User: IIS APPPOOL\ASP.NET v4.0
Description:
Windows cannot log you on because your profile cannot be loaded. Check that you are connected to the network, and that your network is functioning correctly.
DETAIL - Only part of a ReadProcessMemory or WriteProcessMemory request was completed.
This is an error for the IT people, though…
But the simplest solution did work: I just moved my applications to another App Pool… Until I had to restart my computer and then got my trouble doubled:
(I am starting to use “temp” named App Pools – God Save Me!)
Hope I won’t need to format this computer…
A format seems to have solved this problem (I formatted for a different reason)…
Keywords: IIS, Error
Saturday, April 16, 2011
Friday, April 15, 2011
Game Programming with XNA
Part of the Beginner Game Programming in .Net series.
The first step was downloading and installing Microsoft XNA Game Studio 4.0.
Resources:
Keywords: Game programming, XNA
Beginner Game Programming in .Net
This post was moved to my real blog: Beginner Game Programming in .Net
Tuesday, April 12, 2011
LINQ and WCF
I am sure that for most of you this will be back to the basics but I had this exception yesterday:
Type 'System.Linq.Enumerable+WhereSelectEnumerableIterator`2[ClassA,ServiceClassA]' cannot be serialized. Consider marking it with the DataContractAttribute attribute, and marking all of its members you want serialized with the DataMemberAttribute attribute. See the Microsoft .NET Framework documentation for other supported types.
I couldn’t find a solution for it on Google, though I solved it fairly easily.
The situation was this: I had a service, let’s just call it ClassService, with this method:
- [ServiceContract]
- public interface IClassService
- {
- void SomeMethod(ServiceEntity entity);
- }
ServiceEntity was defined as:
- [DataContract]
- public class ServiceEntity
- {
- [DataMember]
- public IEnumerable<ServiceClassA> As { get; set; }
- }
- [DataContract]
- public class ServiceClassA { }
Now the client looked something like:
- public IClassService ClassService { get; set; }
- public void CallService(IEnumerable<ClassA> list)
- {
- ClassService.SomeMethod(new ServiceEntity{As = list.Select(Convert)});
- }
- private ServiceClassA Convert(ClassA a)
- {
- return new ServiceClassA();
- }
(just imagine the constructor initializing the ClassService)
Do you see the mistake?
The problem comes from the line:
- ClassService.SomeMethod(new ServiceEntity{As = list.Select(Convert)});
Or most specifically:
- new ServiceEntity{As = list.Select(Convert)}
The problem is that LINQ doesn’t actually execute this query. It waits until you try to do something with it, like converting it to a list or using one of the entities. The IEnumerable actually has the type 'System.Linq.Enumerable+WhereSelectEnumerableIterator`2[ClassA,ServiceClassA]', which is not a WCF DataContract.
The solution is fairly easy:
- ClassService.SomeMethod(new ServiceEntity{As = list.Select(Convert).ToList()});
Forcing the LINQ query to execute by creating a List out of it is enough.
Keywords: LINQ, WCF, Exception
Saturday, April 9, 2011
Unit Tests: Introduction
This post was moved to my real blog: Unit Tests: Introduction
Tuesday, March 29, 2011
Silverlight: Adding Google Streets
This post was moved to my real blog: Silverlight: Adding Google Streets
Monday, March 28, 2011
Linq2Sql: Changing the Database Schema at Runtime (without XMLs)
Our company has too many databases. Not only do we have a database server for development and another ~5 servers per version for QA/integration, we also have a production database server for each country we deal with.
I don’t really know why, but on our development server we have a database per country (which is valid) with different database schemas: BOB for our regular schema and BOB_JPN for Japan. In our DAL (Data Access Layer), our code that uses IWorkspace takes the DB Schema from an application config file, but we also have code that uses Linq2Sql (since it is faster) that has the DB Schema hard coded in the designer.cs code:
- [global::System.Data.Linq.Mapping.TableAttribute(Name="BOB.SOME_TABLE")]
- public partial class SOME_TABLE
- {
- private int _OBJECTID;
- private short _TypeId;
Until today we handled this by deleting the schema from the designer code, so that the query runs with the user’s default schema, which in Japan is BOB_JPN. My team leader has done this “simply” by changing the code in the designer.cs file.
I decided I can’t allow this to continue. I had four options:
- Change the DB Schema in Linq2Sql
- Write the entity code by hand, without the DB Schema
- Use some other technology (such as ADO.Net) instead of Linq2Sql
- Use my team leader’s way and change the designer code
I of course preferred using Linq2Sql, simply because the code works and we are going to production soon.
My first Google search “linq2sql config db schema” was a bust.
My second search “change linq mapping in runtime” found this:
External Mapping Reference (LINQ to SQL) – using external XML files, nothing on runtime changes
LINQ to SQL - Tailoring the Mapping at Runtime – again too complicated, it was something like build the xml in runtime and tailor it in…
On my third search I decided to think outside the box (actually I decided to run away from the box): “reflection change attribute at runtime”
Change Attribute's parameter at runtime
My first trial was with the code that was marked as not working (I hoped the bug had been fixed since 2008):
- private void ChangeSchema()
- {
- if (DefaultGisConfigSection.Instance.SchemaName.ToUpper().CompareTo("BOB") == 0)
- return;
- ChangeTableAttribute(typeof (STREET));
- }
- private void ChangeTableAttribute(Type table)
- {
- var tableAttributes = (TableAttribute[])
- table.GetCustomAttributes(typeof (TableAttribute), false);
- tableAttributes[0].Name = DefaultGisConfigSection.Instance.SchemaName + "." + TableName;
- }
Didn’t work.
My second trial didn’t work either:
- private void ChangeTableAttribute(Type table)
- {
- TypeDescriptor.AddAttributes(table,
- new TableAttribute
- {Name = DefaultGisConfigSection.Instance.SchemaName + "." + TableName});
- }
But I think this time it’s more my reflection code than anything else. The Exception in both cases was:
Test method Shepherd.Core.Dal.Tests.SomeTest threw exception:
System.Data.SqlClient.SqlException: Invalid object name 'BOB.SOME_TABLE'.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning()
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.ConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
at System.Data.Common.DbCommand.ExecuteReader()
at System.Data.Linq.SqlClient.SqlProvider.Execute(Expression query, QueryInfo queryInfo, IObjectReaderFactory factory, Object[] parentArgs, Object[] userArgs, ICompiledSubQuery[] subQueries, Object lastResult)
at System.Data.Linq.SqlClient.SqlProvider.ExecuteAll(Expression query, QueryInfo[] queryInfos, IObjectReaderFactory factory, Object[] userArguments, ICompiledSubQuery[] subQueries)
at System.Data.Linq.SqlClient.SqlProvider.System.Data.Linq.Provider.IProvider.Execute(Expression query)
at System.Data.Linq.DataQuery`1.System.Collections.Generic.IEnumerable<T>.GetEnumerator()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList(IEnumerable`1 source)
For the last try before calling it quits I decided to see what Microsoft is doing behind the scenes, so I activated the Debug Symbols option (this is something I don’t encourage anyone to try, because the last time I did this with the ESRI symbols it just didn’t remove itself for several days and the performance was bad!).
Going over the code I reached this method in C:\Projects\SymbolCache\src\source\.NET\4\DEVDIV_TFS\Dev10\Releases\RTMRel\ndp\fx\src\DLinq\Dlinq\Mapping\AttributedMetaModel.cs\1599186\AttributedMetaModel.cs:
- internal MetaTable GetTableNoLocks(Type rowType) {
It did:
- TableAttribute[] attrs = (TableAttribute[])root.GetCustomAttributes(typeof(TableAttribute), true);
And got the original table name – Now I know what must be changed!
But I can’t, since that was my first trial. Going into the framework code here just got me into more trouble. At the end of a very long (~200 lines) method:
- [System.Security.SecurityCritical] // auto-generated
- private unsafe static object[] GetCustomAttributes(
- RuntimeModule decoratedModule, int decoratedMetadataToken, int pcaCount,
- RuntimeType attributeFilterType, bool mustBeInheritable, IList derivedAttributes, bool isDecoratedTargetSecurityTransparent)
in C:\Projects\SymbolCache\src\source\.NET\4\DEVDIV_TFS\Dev10\Releases\RTMRel\ndp\clr\src\BCL\System\Reflection\CustomAttribute.cs\1305376\CustomAttribute.cs I found that the attribute is being built by:
Going into that got me this:
in text:
---------------------------
Microsoft Visual Studio
---------------------------
File Load
Some bytes have been replaced with the Unicode substitution character while loading file C:\Projects\SymbolCache\src\source\.NET\4\DEVDIV_TFS\Dev10\Releases\RTMRel\ndp\fx\src\DLinq\Dlinq\Mapping\Attributes.cs\1305376\Attributes.cs with Unicode (UTF-8) encoding. Saving the file will not preserve the original file contents.
---------------------------
OK
---------------------------
The result is an empty TableAttribute, the value for the attribute comes from an unsafe method:
- [System.Security.SecurityCritical] // auto-generated
- [ResourceExposure(ResourceScope.None)]
- [MethodImplAttribute(MethodImplOptions.InternalCall)]
- private unsafe extern static void _GetPropertyOrFieldData(
- RuntimeModule pModule, byte** ppBlobStart, byte* pBlobEnd, out string name, out bool bIsProperty, out RuntimeType type, out object value);
I give up. I should have given up when Google failed me but… I have decided to post it as a question on Stack Overflow:
Modifying Class Attribute on Runtime
I next decided to change the attribute by inheritance:
- [TableAttribute(Name = "SOME_TABLE")]
- public class SomeTable:SOME_TABLE
That got me the Exception:
System.InvalidOperationException: Data member 'Int32 OBJECTID' of type 'Project.Dal.SOME_TABLE' is not part of the mapping for type 'SomeTable'. Is the member above the root of an inheritance hierarchy?
Basically this is not possible because of a limitation in Linq2Sql (damn!).
In the end I chose option 2, writing the code by hand. But I learned that going into .Net inner code is a trial of sanity: you either find what you are looking for and lose your sanity, or you give up…
//TODO: Missing some code in the middle, given the previous paragraph do I really want to look for it?!?
Keywords: Linq2Sql, reflection, DB Schema
Wednesday, March 23, 2011
Debugging Tip
This post was moved to my real blog: Debugging Tip
Starting with MVVM (in Silverlight)
This post was moved to my real blog: Starting with MVVM (in Silverlight)
Sunday, March 20, 2011
ArcObjects: Workspace is Down
This post was moved to my real blog: ArcObjects: Workspace is Down
Wednesday, March 16, 2011
File GeoDatabase: Getting the Workspace
I have created FileWorkspaceUtils, which inherits from WorkspaceUtils and adds the functions GetRows and GetFeatures that return the raw IRow and IFeature data. In WorkspaceUtils I preferred that the low level programmer won’t even know he has something called IRow or IFeature.
- public class FileWorkspaceUtils:WorkspaceUtils
- {
- public FileWorkspaceUtils(IFeatureWorkspace workspace) : base(workspace)
- {
- }
- public List<IRow> GetRows(string tableName)
- {
- var result = new List<IRow>();
- DoActionOnSelectRows(tableName, null, row => result.Add(row.Clone()));
- return result;
- }
- public List<IFeature> GetFeatures(string layerName)
- {
- var result = new List<IFeature>();
- DoActionOnSelectFeatures(layerName, null, feature => result.Add(feature.Clone()));
- return result;
- }
- }
//TODO: Post on the wonder of Extension Methods (row.Clone())
I have added code to WorkspaceProvider so that it will return the FileWorkspaceUtils (independent of File/Personal GeoDatabase):
- private const string PersonalGeoDatabaseFileExtension = ".MDB";
- private const string FileGeoDatabaseFileExtension = ".GDB";
- /// <summary>
- /// Get a File WorkspaceUtils for Personal and File GeoDatabase
- /// </summary>
- /// <param name="filePath"></param>
- /// <returns></returns>
- public FileWorkspaceUtils GetFileWorkspace(string filePath)
- {
- var extension = (Path.GetExtension(filePath) ?? String.Empty).ToUpper();
- if (extension.CompareTo(PersonalGeoDatabaseFileExtension) == 0)
- return CreatePersonalGeoDatabaseWorkspace(filePath);
- if (extension.CompareTo(FileGeoDatabaseFileExtension) == 0)
- return CreateFileGeoDatabaseWorkspace(filePath);
- throw new NotImplementedException("The only supported file types are mdb and gdb. Not: " + extension);
- }
- private FileWorkspaceUtils CreatePersonalGeoDatabaseWorkspace(string filePath)
- {
- AccessWorkspaceFactory workspaceFactory = new AccessWorkspaceFactoryClass();
- var workspace = workspaceFactory.OpenFromFile(filePath, 0);
- return new FileWorkspaceUtils((IFeatureWorkspace)workspace);
- }
- private FileWorkspaceUtils CreateFileGeoDatabaseWorkspace(string filePath)
- {
- FileGDBWorkspaceFactory workspaceFactory = new FileGDBWorkspaceFactoryClass();
- var workspace = workspaceFactory.OpenFromFile(filePath, 0);
- return new FileWorkspaceUtils((IFeatureWorkspace)workspace);
- }
The only problem is that it doesn’t work: my unit test that just checks GetFileWorkspace throws a COMException:
Test method CompanyName.GIS.Core.Esri.Tests.WorkspaceProviderTests.GetWorkspace_ValidPersonalGeoDB_GetFileWorkspaceUtils threw exception:
System.Runtime.InteropServices.COMException: Exception from HRESULT: 0x80040228
at ESRI.ArcGIS.DataSourcesGDB.AccessWorkspaceFactoryClass.OpenFromFile(String fileName, Int32 hWnd)
at Core.Esri.WorkspaceProvider.CreatePersonalGeoDatabaseWorkspace(String filePath) in WorkspaceProvider.cs: line 200
at Core.Esri.WorkspaceProvider.GetFileWorkspace(String filePath) in WorkspaceProvider.cs: line 189
at Core.Esri.Tests.WorkspaceProviderTests.GetWorkspace_ValidPersonalGeoDB_GetFileWorkspaceUtils() in WorkspaceProviderTests.cs: line 55
The problem was caused by licensing. I changed EsriInitilization to contain the old style licensing as well (the one with IAoInitialize; the new stuff uses RuntimeManager).
All my unit tests (427 tests) pass, so it works… Here is the class:
- public class EsriInitilization
- {
- private static bool _isStarted = false;
- public static void Start()
- {
- if (_isStarted)
- return;
- if (!Initialize(ProductCode.Server, esriLicenseProductCode.esriLicenseProductCodeArcServer))
- {
- if(!Initialize(ProductCode.Engine, esriLicenseProductCode.esriLicenseProductCodeEngineGeoDB))
- {
- throw new ApplicationException(
- "Unable to bind to ArcGIS license Server nor to Engine. Please check your licenses.");
- }
- }
- _isStarted = true;
- }
- private static bool Initialize(ProductCode product, esriLicenseProductCode esriLicenseProduct)
- {
- if (RuntimeManager.Bind(product))
- {
- IAoInitialize aoInit = new AoInitializeClass();
- aoInit.Initialize(esriLicenseProduct);
- return true;
- }
- return false;
- }
- }
That still threw an exception, this time simply because IFeature refused to be cloned, even though it implemented ESRI’s IClone interface. The error I got was:
//TODO: Write error and new code
//TODO: Post after writing about Extension Method (TODO above)
Resources:
Esri Forum: COM Exception 0x80040228 When Opening a Personal Geodatabase
Keywords: License, COM, exception, IWorkspace, engine, Server, ArcGis, ESRI, Unit tests, MDB, GDB
Saturday, March 12, 2011
Semantic Similarities
For the last year I have been working on the final project for my Master's degree in Computer Science. My college, the Academic College of Tel-Aviv-Yaffo, doesn’t require a Thesis but uses a combination of a final test (covering the core subjects of both the Bachelor and the M.Sc. degrees) and a final project done with one of the Doctors/Professors in my college. My project, supervised by Professor Gideon Dror, is on semantic similarities, and I am nearly done: all that is left is to present my work in front of my professor and a faculty member.
I have decided to first present my work here and then actually do the presentation.
The project was done mostly in Python (which I had no prior knowledge of) and its first part was done as a possible contribution to the NLTK library.
The first part of the project was about implementing methods to find semantically similar words using input triplets of Context Relation. A Context Relation triplet is two words and their relation to each other, extracted from a sentence. It was shocking to find in the end that NLTK hasn’t implemented a way to extract Context Relations from a text (they have a few demos done by human hand), and it seems that implementing this requires a knowledge of linguistics that I just don’t possess.
The second part of the project was to extract the Semantic Similarities of words from the web site Yahoo Answers. The idea is that with enough data extracted from different categories an algorithm can be used to determine the distance between the words.
On to the presentation:
For this discussion we will ignore the part “without being identical”. In this project identical is included in similar.
Are Horse and Butterfly similar? The first response should of course be NO, but it depends: comparing horse to butterfly to house reveals that horse and butterfly are similar. It just depends on the context…
Likewise comparing a horse to a zebra the response would be YES. But looking at a sentence such as:
The nomad has ridden the ____
and looking at horse, zebra and camel which is more similar in this context?
This time the only similarity between these words is the way they are written and pronounced. Their context relation should be very dissimilar no matter the text. But imagine using a naive algorithm that only counts the number of words in a text: is it really inconceivable to have a close number of occurrences of these words?
Humans use similarities to learn new things: a zebra is similar to a horse with stripes. But similarity is also used as a tool for our memory; in learning new names it helps to associate the name with the person by using something similar. For example, to remember the name Olivia it could be useful to imagine that person with olive-like eyes.
In software, search engines use similar words to return more results. For example, a few days ago I searched for a driver chair and one of the top results was a video of a driver seat.
Possible future uses for similar words could be in AI software. There is a yearly contest named the Loebner Prize for “computer whose responses were indistinguishable from a human's”. If we could teach a computer a baseline of sentences and then advance it by using similar words (like the learning of humans) it could theoretically be “indistinguishable from a human's”.
Imagine having the AI memorize chats, simply by extracting chats in Facebook or Twitter. Then have the AI extend those sentences with similar words. For example, in a real chat:
- Have you fed the dog?
Could be extended to:
- Have you fed the snake?
(some people do have pet snakes… and I can imagine a judge trying to trip an AI with this kind of question…)
A simple definition: if we have a sentence containing word, and we replace word with word’ and the sentence is still correct, then the words are Semantically Similar. From now on Similarity actually means Semantic Similarity.
From the examples we can see that Similarity is all about Context, Algorithm and Text.
As we could see in the examples the Context of the words makes a large difference whether or not two words are similar. Unlike Algorithm and Text, it has nothing to do with the implementation of finding the similarity.
Some Algorithms use Context Relation to give value to the context the words are in. Extracting Context Relation from text is a very complicated task and has yet to have an implementation in NLTK, though the library does have a couple of examples that were created by human means.
A simpler form of Context is a words radius: looking at all the words within a distance of 4 words from the word Horse. One of the Algorithms we will examine uses this as a simpler Context aspect for the Algorithm.
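A minimal sketch of that words radius idea, assuming plain whitespace tokenization (the function name `words_in_radius` and the sample sentence are mine, not part of the project code):

```python
from collections import Counter


def words_in_radius(tokens, target, radius=4):
    """Count the words that appear within `radius` positions of `target`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - radius)
        # Words before and after the target, excluding the target itself.
        window = tokens[start:i] + tokens[i + 1:i + 1 + radius]
        counts.update(window)
    return counts


# Hypothetical sample text, tokenized naively on whitespace.
tokens = "the nomad has ridden the horse across the dry desert".split()
context = words_in_radius(tokens, "horse")
```

Each occurrence of the target adds up to 8 neighbours to the count, so the result is a context vector for the word without any Context Relation extraction.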
Another form of Context extraction is separating the text based on category. Then each category adds a different Similarity value and those can be added together.
Algorithms that ignore the Context of the word are therefore less accurate than those that use it, but the ones that use Context are also more complex. The complexity comes either from using Context Relation (with its complex extraction) or from using a words radius, which just means individual work for each word.
All the Algorithms use some form of counting mechanism to determine the Similarity/Distance between the words.
Depending on the Algorithm a different scoring is done for each word. Then the Algorithm determines how to convert that score into the Distance between the words, which just means calculating the Similarity.
Text is a bit misplaced here because it is a part of the Context and is used inside the Algorithms. Choosing the right text therefore is as essential a part as choosing the right Algorithm.
But imagine a text that contains only the words:
This is my life. This is my life…
All the practical Algorithms shown here will tell you that “this” and “life” are Similar words – based on this text alone.
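To see why, here is a small sketch that counts each word per category and compares the count vectors. Cosine similarity is my assumption for the comparison step (the post does not name the measure), and the helper names are hypothetical:

```python
import math
import re
from collections import Counter


def category_vector(word, texts_by_category):
    """Count `word` in each category, in a fixed (sorted) category order."""
    vector = []
    for _, text in sorted(texts_by_category.items()):
        counts = Counter(re.findall(r"[a-z']+", text.lower()))
        vector.append(counts[word])
    return vector


def cosine_similarity(u, v):
    """Assumed measure: cosine of the angle between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# The whole "corpus" is the single text from the example above.
texts = {"only_category": "This is my life. This is my life"}
similarity = cosine_similarity(category_vector("this", texts),
                               category_vector("life", texts))
```

Both words occur twice in the only category, so their vectors are identical and the similarity is the maximum possible. With more categories and more text the vectors start to differ.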
In my second implementation of Similarity Algorithms I used extracted text from several categories of Yahoo Answers. Yahoo Answers is a question+answer repository that contains thousands of questions and answers. For my Algorithms I had to extract 2GB of data from the site (just so I had enough starting data).
The Algorithms can be separated into two groups: those that use Context Relation (and therefore, until an extractor for Context Relation is implemented, are purely theoretical), and those that use Category Vector as a form of Context for the words.
All the Context Relation Algorithms use these two inner classes: Weight and Measure. Weight is the inner class that gives a score for the Context Relation. The Weight is important since a Context Relation that appears only once in a text should not have the same score as one that appeared ten times. The Measure inner class calculates the distance between two words using the Weight inner class. Using only these classes the user can be given a Similarity value for two words.
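A rough sketch of how a Weight and a Measure could fit together. The class names, the relation labels and the choice of a frequency weight with a weighted Jaccard measure are all my assumptions for illustration, not the classes from the project:

```python
from collections import Counter


class FrequencyWeight:
    """Hypothetical Weight: a context relation scores the number of times it was seen."""

    def score(self, contexts, relation):
        return contexts[relation]


class WeightedJaccardMeasure:
    """Hypothetical Measure: shared weight divided by total weight."""

    def __init__(self, weight):
        self.weight = weight

    def similarity(self, contexts_a, contexts_b):
        relations = set(contexts_a) | set(contexts_b)
        shared = sum(min(self.weight.score(contexts_a, r),
                         self.weight.score(contexts_b, r)) for r in relations)
        total = sum(max(self.weight.score(contexts_a, r),
                        self.weight.score(contexts_b, r)) for r in relations)
        return shared / total if total else 0.0


# Made-up context relation counts: (relation, other word) -> occurrences.
horse = Counter({("object-of", "ride"): 10, ("modified-by", "wild"): 1})
camel = Counter({("object-of", "ride"): 4, ("object-of", "feed"): 2})

measure = WeightedJaccardMeasure(FrequencyWeight())
horse_camel = measure.similarity(horse, camel)
```

Swapping in another Weight or Measure only means writing a class with the same `score` or `similarity` method, which is what makes trying different combinations cheap.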
The Algorithms in this section implement nearest-neighbor searches. We use them to find the K most similar words in the text, not just how similar two words are.
Taken from James R. Curran (2004)-From Distributional to Semantic Similarity
In my Theoretical work I implemented some of the inner classes of Weight and Measure from James R. Curran’s paper From Distributional to Semantic Similarity.
I am not going to go into lengthy discussion on how they work because the paper discusses all of this.
I am going to say that the Similarities turn out different for each combination of Weight X Measure and that it is fairly easy to set a combination up or to implement a new Weight/Measure class.
The classes I chose to implement are taken from Scaling Distributional Similarity to Large Corpora, James Gorman and James R. Curran (2006). These classes are used to find the K most similar words to a given word.
The simplest algorithm is a brute force one. First we calculate the Distance Matrix between our word and all the rest of the words in the text and then we search for the K most Similar words.
The disadvantage of this algorithm is that the calculation for finding the K-nearest words for “hello” can’t be reused for the word “goodbye” (actually only one calculation can be reused here, and that is the one between “hello” and “goodbye”).
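The brute force search can be sketched like this. The vectors are made-up numbers and cosine similarity stands in for whichever Measure is plugged in, so treat it as an illustration only:

```python
import heapq
import math


def cosine_similarity(u, v):
    """Stand-in measure (an assumption): cosine between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_most_similar(target, vectors, k):
    """Score the target against every other word, then keep the top K."""
    scored = ((cosine_similarity(vectors[target], vector), word)
              for word, vector in vectors.items() if word != target)
    return [word for _, word in heapq.nlargest(k, scored)]


# Hypothetical context vectors for four words.
vectors = {
    "hello": [4, 1],
    "hi": [4, 1],
    "goodbye": [3, 2],
    "bread": [0, 5],
}
nearest = k_most_similar("hello", vectors, k=2)
```

Every call compares the target with all the other words, so asking for the neighbours of “goodbye” afterwards repeats almost all of the work.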
I am not going to go into the other implementations here since they are more complex. I might write another post in the future about those algorithms.
If you are interested, the Python implementation can be found here (or you can just read Scaling Distributional Similarity to Large Corpora).
There are two practical Algorithms that I have implemented.
This simple algorithm is very fast and can be preprocessed for even faster performance. By simply saving the count of each word per category, the Algorithm can be made as fast as reading the preprocessed file. In small examples of just 50MB of data the Algorithm took only a few seconds to produce a result. Using the full 2GB of data it takes ~10 minutes to get a result for ~350 word pairs to compare. Because of the large amount of data, though, the data must be opened in chunks (a chunk per category) or an Out of Memory Exception is thrown.
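A small sketch of that preprocessing step, assuming the categories arrive one at a time as (name, text) pairs and that the counts fit in one JSON file. The file name, function names and sample texts are hypothetical:

```python
import json
import re
import tempfile
from collections import Counter
from pathlib import Path


def preprocess(categories, output_path):
    """Count every word per category, one category (chunk) at a time."""
    counts = {}
    for name, text in categories:
        counts[name] = Counter(re.findall(r"[a-z']+", text.lower()))
    Path(output_path).write_text(json.dumps(counts))


def load_vector(word, output_path):
    """Read the saved counts and build the category vector of `word`."""
    counts = json.loads(Path(output_path).read_text())
    return [counts[name].get(word, 0) for name in sorted(counts)]


# Made-up miniature categories standing in for the Yahoo Answers data.
categories = [
    ("food", "bread and butter, more bread"),
    ("sports", "football and soccer"),
]
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "counts.json"
    preprocess(categories, path)
    bread_vector = load_vector("bread", path)
    soccer_vector = load_vector("soccer", path)
```

Once the counts are saved, comparing a pair of words only needs the two vectors read back from the file, not the original text.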
The end of this Algorithm is identical to the first Algorithm, but the words radius Algorithm clearly has more vectors. Preprocessing this data is both time consuming (takes ~5 days) and space consuming (from 2GB to 15GB). Just preprocessing the data caused at least 10 Out of Memory Exceptions (Python does have automatic garbage collection, but I had to force a collection after every category by calling gc.collect() manually).
The calculation time for ~350 word pairs to compare was ~25 hours, which of course can’t be used in real time AI conversations. Though with the preprocessing it doesn’t matter if there are 350 or 35k words to compare, since it will take approximately the same time. For example, three categories of ~120MB with ~350 pairs take ~56 minutes but 3 pairs take ~30 minutes.
It’s important to note that both Algorithms have close results; for example bread,butter has a Similarity of 0.56, which is pretty high.
As can be seen, Basic almost always gives a greater result than Words Radius. Basic also has some weird results: for example, Maradona is more Similar to football than soccer is, though many places (not the USA) use the two as synonyms, whereas Words Radius seems to think soccer is the one more similar to football.
Since Words Radius actually uses a form of Context Relation (though not a very lexical one) it is considerably more accurate.
Remember I claimed it was all about the text? well this results were done with just a few categories and suddenly Arafat is similar to Jackson, how weird is that?
Another difference is the calculation time the Simple Algorithm takes ~17 seconds where the Words Radios Algorithm takes ~56 minutes.
BTW remember night and knight? Well the simple algorithm returned 0.79 Similarity for those 3 categories… And the Words Radios returned 0.48 Similarity.
So do you have any questions? Suggestions? Too long? Too short?
Tell me what you think…
Keywords: similarity, NLTK, search, AI
Friday, February 25, 2011
Silverlight 4
Things I got from watching the videos in TekPub on Silverlight
<ItemsControl ItemsSource="{Binding}"
does a data binding to the current DataContext, ItemsControl is a list of items
<ItemsControl.ItemTemplate>
specifies how the items are going to be shown, binding here will be to the inside items
Margin
Attribute for how far the element is from the elements around it
Padding
Attribute for the distance from the content inside the element to its edge
Binding has a StringFormat, good for date fields
<UserControl … d:DataContext="{d:DesignInstance Type=local:DemoClass, IsDesignTimeCreatable=True}"
Does a data binding at design time so you can see how the control will look
Markup Extension has a default property, binding has Path:
{Binding Path=Hello} is the same as {Binding Hello}
Attached Properties, for example in grid: (ClassName.Property)
<Button Grid.Column="1"
x:Name and Name are interchangeable when the element has a Name property (one sets the other). Prefer x:Name because it is always available.
Canvas specifies the exact location of the controls from the top and the left, using Canvas.Top and Canvas.Left. Canvas.ZIndex sets which control is on top; a higher value renders on top. Controls can be positioned outside of the canvas.
Grid
Height = * (Weighted proportion can be 2* or 2.5*) / 50 (fixed value) / Auto (collapse to the smallest possible height)
the default is *
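To make the star arithmetic concrete, here is a small sketch of how the available height gets divided. It is plain Python, not Silverlight code, and it is a simplified model: the real layout pass also handles min/max sizes, and an Auto row depends on the measured content, which is simply passed in here as a number. The function name and the tuple format are made up for the example.

```python
def resolve_heights(total, rows):
    """Split `total` pixels among rows given as
    ("fixed", px), ("auto", content_px) or ("star", weight)."""
    # Fixed and Auto rows take their size first.
    used = sum(value for kind, value in rows if kind in ("fixed", "auto"))
    # Star rows share whatever is left, in proportion to their weights.
    weight_sum = sum(value for kind, value in rows if kind == "star")
    remaining = max(total - used, 0)
    heights = []
    for kind, value in rows:
        if kind == "star":
            heights.append(remaining * value / weight_sum)
        else:
            heights.append(float(value))
    return heights
```

For a 300 pixel grid with rows `50`, `*`, `2*` and an Auto row whose content is 10 pixels high, the two star rows split the remaining 240 pixels as 80 and 160.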
HorizontalAlignment: Left, Center, Right, Stretch. Center is the same as Stretch when the width is set
VerticalAlignment: Top, Center, Bottom, Stretch. Center is the same as Stretch when the height is set
StackPanel has no attached properties
Orientation – Vertical (default), Horizontal
Silverlight Toolkit: Silverlight.codeplex.com
DockPanel
toolkit:DockPanel.Dock="Top" will cause the control to dock to the top
WrapPanel is like the StackPanel but wraps the controls to the next row/column
Controls and Dialog Boxes – MSDN page that lists almost all the official controls
Sources: Runtime, SDK, Toolkit
Border Control (Runtime without a namespace) – adds a border with color, margin, padding.
Has only one child element.
ScrollViewer adds scrollbars to the content when needed. The horizontal scrollbar is disabled by default.
Viewbox by default stretches the child element.
StretchDirection: UpOnly (larger only), DownOnly (smaller only), Both
GridSplitter (SDK) dynamically changes the size of the grid columns/rows
Grid.Column tells it which column it affects
It overlaps the controls, so you should add a column for the splitter and set HorizontalAlignment="Center"
ChildWindow shows a modal popup
need to add Silverlight Child Window control
Has a Show method to show the child window
Open/save file
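var a = new OpenFileDialog();
a.Multiselect = true; // allow selecting more than one file
a.Filter = "(*.cad)|*.cad;*.cdw|(images)|*.bmp"; // description|pattern pairs
bool? ok = a.ShowDialog(); // returns true/false
var result = a.File; // the selected file
var results = a.Files; // all selected files (when Multiselect = true)
var b = new SaveFileDialog();
b.ShowDialog();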
(not mine taken from the presentation)
Change Notifications:
INotifyPropertyChanged – notify the UI that a property was changed
Passing an empty string gives a notification that everything was changed
Value Converter:
bool to visibility
IValueConverter
Convert has a parameter
Mode:
OneTime (takes the first binding value only), OneWay (default, read-only), TwoWay (changes flow from both the control and the source)
Binding to other controls:
{Binding ElementName=ControlName, Path=PropertyName}
Building Business Applications
WCF RIA Services, Project Building Business Applications
Assets – graphic styles, resources
Models – domain models
Views – screens in SL
GlobalSuppressions.cs – turn off some of the code analysis that is on by default
Web - Models\Shared: code here is compiled into both projects (in the Silverlight project it is under the hidden Generated_Code folder)
Add EntityFramework of the DB in the Web Project – name myEf
Add Domain Service class to the Web Project
<ds:DomainDataSource x:Name="blahData" QueryName="BlahQuery">
<ds:DomainDataSource.DomainContext>
<my:MyEfDomainContext />
</ds:DomainDataSource.DomainContext>
</ds:DomainDataSource>
<sdk:DataGrid ItemsSource="{Binding Data, ElementName=blahData}" />
<sdk:DataPager Source="{Binding Data, ElementName=blahData}" PageSize="30"/>
The data in the service needs to be ordered before doing this
Building Business Applications, part 2
DataFormControl
Navigation Framework -
<navigation:Frame>
<navigation:Frame.UriMapper>
<uriMapper:UriMapper>
<uriMapper:UriMapping Uri="" MappedUri="/Views/Home.xaml"/> <-- default page
<uriMapper:UriMapping Uri="/{pageName}" MappedUri="/Views/{pageName}.xaml"/> <-- any other page, by name
</uriMapper:UriMapper>
</navigation:Frame.UriMapper>
</navigation:Frame>
<HyperlinkButton NavigateUri="/Customer" …/>
Navigates to /Views/Customer.xaml page
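The substitution the mappings above perform can be illustrated with a few lines of Python. This is only an approximation of the idea (first matching pattern wins, placeholders are copied into the target), not the actual matching rules of the Silverlight UriMapper, and `map_uri` is a made-up name.

```python
import re


def map_uri(uri, mappings):
    """Return the mapped URI for the first pattern that matches, else None."""
    for pattern, target in mappings:
        # Turn "/{pageName}" into a regex with a named group per placeholder.
        regex = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>[^/]+)", re.escape(pattern))
        match = re.fullmatch(regex, uri)
        if match:
            # Substitute each captured value into the target template.
            result = target
            for name, value in match.groupdict().items():
                result = result.replace("{" + name + "}", value)
            return result
    return None


mappings = [
    ("", "/Views/Home.xaml"),
    ("/{pageName}", "/Views/{pageName}.xaml"),
]
```

With these two mappings, `map_uri("/Customer", mappings)` gives `/Views/Customer.xaml` and the empty URI gives the home page.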
Ctrl+K, Ctrl+D To reformat the XAML
Using the CustomDataForm (created by the project template)
<c:CustomDataForm AutoGenerateFields="True" ItemsSource="{Binding Data, ElementName=blahData}"
EditEnded="Event_a" CommandButtonsVisibility="All"/>
Should be used with a DataPager. By default it shows read-only fields and marks required fields
Event_a(…){
    if (e.EditAction == DataFormEditAction.Commit) {
        // submit only if a submit is not already in progress
        if (!CustomerData.IsSubmittingChanges) CustomerData.SubmitChanges();
    }
}
Data Annotations in Web:
XDomainService.metadata.cs
Possible Attributes:
[Required(ErrorMessage="you…")]
[StringLength]
[RegularExpression]
[Range]
[EnumDataType] – must be a value in the enum
[CustomValidation]
[Display] – the text of the field
[Editable] – readonly
Setting up the wanted fields:
<c:CustomDataForm AutoGenerateFields="True" ItemsSource="{Binding Data, ElementName=blahData}" >
<c:CustomDataForm.EditTemplate>
<DataTemplate>
<StackPanel>
<TextBox Text="{Binding Company, Mode=TwoWay}"/>
</StackPanel>
</DataTemplate>
</c:CustomDataForm.EditTemplate>
</c:CustomDataForm>
Add/Delete is built in
Delete doesn’t commit out of the box, the pattern seen is an extra button to submit all the changes
BusyIndicator – while busy, all controls inside it are read-only and a busy popup is shown
Wrap all the controls in a BusyIndicator, one of the controls the template creates
<c:BusyIndicator BusyContent="Loading…" IsBusy="{Binding IsLoadingData, ElementName=InvoiceData}">
Sunday, February 20, 2011
Rebuild All failed without any errors (Silverlight), Part 2
A bit of an update from part 1.
I decided to do a clean of the solution (plus manually delete the XAP from the web project) and then a build and got this in the build:
MSBUILD : error : Xap packaging failed. Exception of type 'System.OutOfMemoryException' was thrown.
So the first thing I decided to do was open a new Visual Studio 2010 instance and attach its debugger to the old instance of VS2010, then clean and rebuild. This time I got a bit more:
…
'devenv.exe' (Managed (v4.0.30319)): Loaded 'C:\Program Files\Microsoft SDKs\Expression\Blend\Silverlight\v4.0\Libraries\System.Windows.Interactivity.dll'
'devenv.exe' (Managed (v4.0.30319)): Loaded 'c:\Program Files\Reference Assemblies\Microsoft\Framework\Silverlight\v4.0\System.Xml.dll'
'devenv.exe' (Managed (v4.0.30319)): Loaded 'C:\Program Files\MSBuild\Microsoft\Silverlight\v4.0\XamlServices.dll'
A first chance exception of type 'MS.Internal.Xaml.XamlTypeResolutionException' occurred in XamlServices.dll
A first chance exception of type 'MS.Internal.Xaml.XamlTypeResolutionException' occurred in XamlServices.dll
… (just imagine 600 more lines like those 2)
A first chance exception of type 'MS.Internal.Xaml.XamlTypeResolutionException' occurred in XamlServices.dll
A first chance exception of type 'System.OutOfMemoryException' occurred in Microsoft.Silverlight.Build.Tasks.dll
A first chance exception of type 'System.OutOfMemoryException' occurred in Microsoft.Silverlight.Build.Tasks.dll
A first chance exception of type 'System.ArgumentException' occurred in Microsoft.VisualStudio.ORDesigner.Dsl.dll
… (another repeat performance, this time only 10 lines)
A first chance exception of type 'System.ArgumentException' occurred in Microsoft.VisualStudio.ORDesigner.Dsl.dll
A first chance exception of type 'JetBrains.Util.Assertion.AssertionException' occurred in JetBrains.Platform.ReSharper.ProjectModel.dll
A first chance exception of type 'JetBrains.Util.Assertion.AssertionException' occurred in JetBrains.Platform.ReSharper.ProjectModel.dll
A first chance exception of type 'System.InvalidOperationException' occurred in JetBrains.Platform.ReSharper.VSIntegration.dll
A first chance exception of type 'System.InvalidOperationException' occurred in JetBrains.Platform.dotTrace.VSIntegration.dll
…
Seems like dotTrace and Resharper are playing a tag war with exceptions and my system resources…
At this point I decided to do two things:
- Uninstall dotTrace Performance since I don’t really use it
- Use dotTrace Memory profiler on Visual Studio 2010 and see where the hell all my memory is being wasted… unfortunately that failed me since dotTrace Memory needed more memory than I had.
The next thing I decided to do was uninstall Resharper 6.0 EAP and work without Resharper. There were no problems for a day, but I also got less work done.
At this point I decided to try working with Resharper 5.1 and the problem returned; it just took VS ~7 hours to fail my build. Looking at the clean output this time got me:
========== Clean: 20 succeeded, 10 failed, 0 skipped ==========
Looking at the debug output of the clean, I had these errors:
A first chance exception of type 'System.ArgumentException' occurred in Microsoft.VisualStudio.ORDesigner.Dsl.dll (188 times)
A first chance exception of type 'System.IO.DirectoryNotFoundException' occurred in mscorlib.dll (2 times)
'devenv.exe' (Managed (v4.0.30319)): Loaded 'C:\Program Files\MSBuild\Microsoft\Silverlight\v4.0\XamlServices.dll' (4 times)
'devenv.exe' (Managed (v4.0.30319)): Loaded 'c:\Program Files\Reference Assemblies\Microsoft\Framework\Silverlight\v4.0\System.Windows.dll' (4 times)
A first chance exception of type 'MS.Internal.Xaml.XamlNamespaceException' occurred in XamlServices.dll (23 times)
A first chance exception of type 'System.IO.FileNotFoundException' occurred in mscorlib.dll (20 times)
A first chance exception of type 'MS.Internal.Xaml.XamlTypeResolutionException' occurred in XamlServices.dll (712 times)
A first chance exception of type 'JetBrains.Application.Progress.ProcessCancelledException' occurred in JetBrains.Platform.ReSharper.Shell.dll (2 times)
A first chance exception of type 'JetBrains.Metadata.Utils.PE.MetadataReaderException' occurred in JetBrains.Platform.ReSharper.Metadata.dll (4 times)
A first chance exception of type 'System.ArgumentException' occurred in JetBrains.Platform.ReSharper.Metadata.dll (7 times)
A first chance exception of type 'JetBrains.Application.Progress.ProcessCancelledException' occurred in JetBrains.ReSharper.Daemon.dll (5 times)
The thread 'Cache.AddAssembly #0' (0x2038) has exited with code 0 (0x0). (1 times)
The thread 'Cache.AddAssembly #1' (0x23f8) has exited with code 0 (0x0). (1 times)
A first chance exception of type 'Microsoft.Build.Shared.InternalErrorException' occurred in Microsoft.Build.dll (26 times)
'devenv.exe' (Managed (v4.0.30319)): Loaded '...' (41 times)
The thread '<No Name>' (0x####) has exited with code 0 (0x0). (16 times)
---Cleaned Failed Message---
A first chance exception of type 'JetBrains.ReSharper.Psi.Xaml.Parsing.MarkupExtensionsTreeBuilder.IsTextException' occurred in JetBrains.ReSharper.Psi.Xaml.dll (29 times)
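A summary with "(N times)" counts like the one above can be produced from the raw debug output with a few lines of Python. This is just one possible way to do it; the function name is made up, and it assumes the output window was saved to a text file first.

```python
from collections import Counter


def summarize_output(lines):
    """Collapse repeated lines into 'line (N times)', ordered by first appearance."""
    # Blank lines are skipped; surrounding whitespace is ignored when comparing.
    counts = Counter(line.strip() for line in lines if line.strip())
    return ["%s (%d times)" % (line, count) for line, count in counts.items()]
```

Usage would be something like `summarize_output(open("debug_output.txt", encoding="utf-8"))`, where the file name is only an example.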
I am pretty sure I have problems in my XAML, but why and where? I believe the XAML has trouble being parsed but the parser makes do with what it has (by throwing exceptions which the compiler ignores). I don’t understand why the application works, though…
So while the debugger is still attached I tried opening a Xaml file, got my errors:
A first chance exception of type 'System.ArgumentException' occurred in mscorlib.dll (2 times)
A first chance exception of type 'System.BadImageFormatException' occurred in mscorlib.dll (9 times)
A first chance exception of type 'System.IO.FileNotFoundException' occurred in mscorlib.dll (1 times)
'devenv.exe' (Managed (v4.0.30319)): Loaded '...' (6 times)
A first chance exception of type 'System.ArgumentOutOfRangeException' occurred in mscorlib.dll (3 times)
It took me a while but I got the BadImageFormatException details:
Message: Could not load file or assembly 'System.Core.ni.dll' or one of its dependencies. An attempt was made to load a program with an incorrect format.
Stack Trace: at System.Reflection.AssemblyName.nGetFileInformation(String s)
at System.Reflection.AssemblyName.GetAssemblyName(String assemblyFile)
Could not load file or assembly 'System.Net.ni.dll' or one of its dependencies. An attempt was made to load a program with an incorrect format.
And also for: 'System.ni.dll', 'System.Runtime.Serialization.ni.dll', 'System.ServiceModel.ni.dll', 'System.ServiceModel.Web.ni.dll', 'System.Windows.Browser.ni.dll', 'System.Windows.ni.dll', 'System.Xml.ni.dll'.
Thank God it didn’t search for Knight.ni.dll… (it seems I am not the first developer with this joke…)
The FileNotFoundException was caused by:
Message: Could not load file or assembly 'System.debug.resources, Version=2.0.5.0, Culture=en-US, PublicKeyToken=7cec85d7bea7798e' or one of its dependencies. The system cannot find the file specified.
FusionLog:
=== Pre-bind state information ===
LOG: User = (my user)
LOG: DisplayName = System.debug.resources, Version=2.0.5.0, Culture=en-US, PublicKeyToken=7cec85d7bea7798e
(Fully-specified)
LOG: Appbase = file:///c:/Program Files/Microsoft Visual Studio 10.0/Common7/IDE/
LOG: Initial PrivatePath = NULL
Calling assembly : (Unknown).
===
LOG: This bind starts in default load context.
LOG: Using application configuration file: c:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE\devenv.exe.Config
LOG: Using host configuration file:
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v4.0.30319\config\machine.config.
LOG: Post-policy reference: System.debug.resources, Version=2.0.5.0, Culture=en-US, PublicKeyToken=7cec85d7bea7798e
...
Stack Trace:
at System.Reflection.RuntimeAssembly._nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks)
at System.Reflection.RuntimeAssembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, RuntimeAssembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection, Boolean suppressSecurityChecks)
And twice for:
Could not load file or assembly 'file:///C:\Projects\Company.GIS.GISApp\Bin\Company.GIS.GISApp.dll' or one of its dependencies. The system cannot find the file specified.
(this is my Web application that host the XAP file)
I decided to go over my XAML files with the designer open and fix the errors:
- Cannot register duplicate Name '…' in this scope. Fixed using this. The reason behind this can be found here.
- The file '/FolderName/ControlNameResources.xaml' is not part of the project or its 'Build Action' property is not set to 'Resource'. Fixed this by giving the correct path: '../FolderName/ControlNameResources.xaml'
- Cannot find a Resource with the Name/Key WebConfig [Line: 53 Position: 37]. Now this was in a control that was already fixed
- Element is already the child of another element. [Line: 0 Position: 0]. Again the same control as in the previous item.
- System.InvalidOperationException
An unhandled exception was encountered while trying to render the current silverlight project on the design surface. To diagnose this failure, please try to run the project in a regular browser using the silverlight developer runtime.
at Microsoft.Windows.Design.Platform.SilverlightViewProducer.OnUnhandledException(Object sender, ViewUnhandledExceptionEventArgs e)
at Microsoft.Expression.Platform.Silverlight.SilverlightPlatformSpecificView.OnUnhandledException(Object sender, ViewUnhandledExceptionEventArgs args)
at System.EventHandler`1.Invoke(Object sender, TEventArgs e)
at Microsoft.Expression.Platform.Silverlight.Host.SilverlightImageHost.<>c__DisplayClass1.<Application_UnhandledException>b__0(Object o)
at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
System.Exception
Error HRESULT E_FAIL has been returned from a call to a COM component.
at MS.Internal.XcpImports.CheckHResult(UInt32 hr)
at MS.Internal.XcpImports.SetValue(IManagedPeerBase obj, DependencyProperty property, String s)
at MS.Internal.XcpImports.SetValue(IManagedPeerBase doh, DependencyProperty property, Object obj)
at System.Windows.DependencyObject.SetObjectValueToCore(DependencyProperty dp, Object value)
at System.Windows.DependencyObject.SetEffectiveValue(DependencyProperty property, EffectiveValueEntry& newEntry, Object newValue)
at System.Windows.DependencyObject.UpdateEffectiveValue(DependencyProperty property, EffectiveValueEntry oldEntry, EffectiveValueEntry& newEntry, ValueOperation operation)
at System.Windows.DependencyObject.SetValueInternal(DependencyProperty dp, Object value, Boolean allowReadOnlySet)
at System.Windows.DependencyObject.SetValueInternal(DependencyProperty dp, Object value)
at System.Windows.DependencyObject.SetValue(DependencyProperty dp, Object value)
at System.Windows.ResourceDictionary.set_Source(Uri value)
at BetterPlace.OBS.GIS.MapControl.LoadFeatureLayers() in C:\Projects\Company.GIS.MapControl\MapControl.cs:line 688
at Company.GIS.MapControl.OnApplyTemplate() in C:\Projects\Company.GIS.MapControl\MapControl.cs:line 163
at System.Windows.FrameworkElement.OnApplyTemplate(IntPtr nativeTarget)
The other option is that I have problems with exceptions being thrown in VS; see Tess’s Blog Post.
Found this:
http://willg.co.uk/default.aspx
Could be useful
I had hoped that Visual Studio 2010 SP1 would solve this problem. No such luck.
Hopefully installing the whole system from scratch will solve the problem (I have 4GB of memory and Windows 7 32-bit, and already wanted to upgrade to 64-bit). I am not posting this on the Microsoft forums because if it doesn’t happen on the new system I won’t be able to post the debugger dumps and logs that they will surely need.
Keywords: VS2010, build, error