Entity Framework CompiledQuery: a bug and a workaround

My recent project uses EF, and one feature that has to run a large number of queries performed terribly.

So I tried converting the queries to CompiledQuery, only to find that performance actually got worse. I pulled out the profiler, which showed:

[Image: profiler screenshot]

One third of the queries were calling System.Data.Metadata.Edm.ObjectItemCollection.LoadAllReferencedAssemblies, and almost all of the time was being spent in that method.

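Judging by the name, this method loads the assemblies that the query references. In theory an assembly only needs to be loaded once, so it looked like something was wrong with EF's internal caching.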

Decompiling with Reflector turned up the following code:

    ' System.Data.Objects.CompiledQuery
    Private Function Invoke(Of TArg0 As ObjectContext, TArg1, TArg2, TArg3, TResult)(ByVal arg0 As TArg0, ByVal arg1 As TArg1, ByVal arg2 As TArg2, ByVal arg3 As TArg3) As TResult
        EntityUtil.CheckArgumentNull(Of TArg0)(arg0, "arg0")
        arg0.MetadataWorkspace.LoadAssemblyForType(GetType(TResult), Assembly.GetCallingAssembly)
        Return Me.ExecuteQuery(Of TResult)(arg0, New Object() { arg1, arg2, arg3 })
    End Function

    ' System.Data.Metadata.Edm.MetadataWorkspace
    Friend Sub LoadAssemblyForType(ByVal type As Type, ByVal callingAssembly As Assembly)
        Dim items As ItemCollection
        If Me.TryGetItemCollection(DataSpace.OSpace, items) Then
            Dim items2 As ObjectItemCollection = DirectCast(items, ObjectItemCollection)
            If (Not items2.LoadAssemblyForType(type) AndAlso (Not callingAssembly Is Nothing)) Then
                items2.LoadAllReferencedAssemblies(callingAssembly)
            End If
        End If
    End Sub

    ' System.Data.Metadata.Edm.ObjectItemCollection
    Friend Function LoadAssemblyForType(ByVal type As Type) As Boolean
        Dim flag As Boolean = ObjectItemCollection.LoadAssemblyFromCache(Me, type.Assembly, False)
        If type.IsGenericType Then
            Dim type2 As Type
            For Each type2 In type.GetGenericArguments
                flag = (flag Or Me.LoadAssemblyForType(type2))
            Next
        End If
        Return flag
    End Function

    Private Shared Function LoadAssemblyFromCache(ByVal objectItemCollection As ObjectItemCollection, ByVal [assembly] As Assembly, ByVal loadReferencedAssemblies As Boolean) As Boolean
        Dim flag As Boolean
        If ObjectItemCollection.ShouldFilterAssembly([assembly].FullName) Then
            Return False
        End If

        ' skipped
    End Function

    Friend Shared Function ShouldFilterAssembly(ByVal fullName As String) As Boolean
        Return fullName.EndsWith("PublicKeyToken=b77a5c561934e089", StringComparison.OrdinalIgnoreCase)
    End Function

 

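b77a5c561934e089 is the PublicKeyToken carried by M$'s own framework assemblies such as mscorlib, and at this point the picture is basically clear. When a CompiledQuery executes, it loads every assembly referenced by the calling assembly and caches the result, keyed by TResult and its generic arguments (I did not dig into the exact flow, so this may not be entirely accurate). During the cache lookup, any key type that lives in an M$ assembly is skipped outright.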

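But the cache lookup has a bug: if TResult and all of its generic arguments are types from M$ assemblies, LoadAssemblyForType returns False, so every single query execution ends up calling LoadAllReferencedAssemblies, which slows things down dramatically.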
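To make the failure condition concrete, here is a minimal sketch that mimics the check without touching EF at all. ShouldFilterAssembly is copied from the decompiled code above; WouldLoadFromCache is my own simplified stand-in for ObjectItemCollection.LoadAssemblyForType, and it assumes that any non-filtered assembly loads from the cache successfully (the real method does more work in the part I skipped). MyEntity is a hypothetical type that stands for anything defined in your own project.

```vbnet
Imports System
Imports System.Linq

Module FilterCheckSketch
    ' Same test as the decompiled ShouldFilterAssembly.
    Function ShouldFilterAssembly(ByVal fullName As String) As Boolean
        Return fullName.EndsWith("PublicKeyToken=b77a5c561934e089", StringComparison.OrdinalIgnoreCase)
    End Function

    ' Simplified stand-in for LoadAssemblyForType (assumption): a filtered
    ' assembly contributes False, any other assembly is assumed to load (True).
    ' The result is OR-ed over the type and all of its generic arguments.
    Function WouldLoadFromCache(ByVal type As Type) As Boolean
        Dim flag As Boolean = Not ShouldFilterAssembly(type.Assembly.FullName)
        If type.IsGenericType Then
            For Each arg As Type In type.GetGenericArguments()
                flag = flag Or WouldLoadFromCache(arg)
            Next
        End If
        Return flag
    End Function

    ' Hypothetical type defined in the project's own assembly.
    Class MyEntity
    End Class

    Sub Main()
        ' Every type involved comes from a framework assembly.
        Console.WriteLine(WouldLoadFromCache(GetType(IQueryable(Of Integer))))
        ' One generic argument comes from our own assembly.
        Console.WriteLine(WouldLoadFromCache(GetType(IQueryable(Of MyEntity))))
    End Sub
End Module
```

On the .NET Framework, where IQueryable(Of T) and Integer both live in assemblies signed with that token, the first call should come back False, which is exactly the case that sends MetadataWorkspace.LoadAssemblyForType into LoadAllReferencedAssemblies on every execution. Other runtimes sign their core assemblies with different keys, so the first result may differ there.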

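The project has 2 queries that match this condition, which is exactly the 1/3: one returns IQueryable(Of Integer) and the other IQueryable(Of Decimal). Once the cause is known, the fix is simple. Create a container class, say PrimitiveHolder(Of T), and change the TResult of the CompiledQuery to use it, for example IQueryable(Of PrimitiveHolder(Of Integer)) and IQueryable(Of PrimitiveHolder(Of Decimal)).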
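A rough sketch of what the container could look like. Only the name PrimitiveHolder(Of T) comes from the workaround described here; the Value property and everything in the commented-out usage (MyEntities, Orders, CustomerId, Quantity) are hypothetical names for illustration. The only thing that matters is that the class is compiled into your own assembly, so its PublicKeyToken is not the filtered one.

```vbnet
Imports System

' Container declared in the project's own assembly, so the type used for
' TResult no longer consists solely of framework types.
Public Class PrimitiveHolder(Of T)
    Private _value As T

    Public Property Value() As T
        Get
            Return _value
        End Get
        Set(ByVal newValue As T)
            _value = newValue
        End Set
    End Property
End Class

Module HolderSketch
    Sub Main()
        ' The holder's assembly is the project's assembly, not mscorlib.
        Console.WriteLine(GetType(PrimitiveHolder(Of Integer)).Assembly.FullName)

        ' Usage sketch (hypothetical context and entity names, needs EF):
        ' Dim query = CompiledQuery.Compile(Of MyEntities, Integer, IQueryable(Of PrimitiveHolder(Of Integer)))( _
        '     Function(ctx, customerId) From o In ctx.Orders _
        '                               Where o.CustomerId = customerId _
        '                               Select New PrimitiveHolder(Of Integer) With {.Value = o.Quantity})
    End Sub
End Module
```

Callers then read .Value from each result instead of using the primitive directly, a small price compared with reloading every referenced assembly on each call.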

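(By the way, I have been wrestling with EF on this project for the past few days and keep running into problems... I guess the first version of anything from M$ is always like this... Hopefully EF in .NET 4.0 will be better...)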
